CSV Formats
CSVFormat	Description	Since Version
DEFAULT	IO for the Standard Comma Separated Value format, like RFC 4180 but allowing empty lines.	1.0
EXCEL	IO for the Microsoft Excel CSV format.	1.0
INFORMIX_UNLOAD	IO for the Informix UNLOAD TO file_name command.	1.3
INFORMIX_UNLOAD_CSV	IO for the Informix UNLOAD CSV TO file_name command with escaping disabled.	1.3
MONGODB_CSV	IO for the MongoDB CSV `mongoexport` command.	1.7
MONGODB_TSV	IO for the MongoDB Tab Separated Values (TSV) `mongoexport` command.	1.7
MYSQL	IO for the MySQL CSV format.	1.0
ORACLE	IO for the Oracle CSV format of the SQL*Loader utility.	1.6
POSTGRESQL_CSV	IO for the PostgreSQL CSV format used by the `COPY` operation.	1.5
POSTGRESQL_TEXT	IO for the PostgreSQL Text format used by the `COPY` operation.	1.5
RFC4180	IO for the RFC-4180 format defined by RFC 4180.	1.0
TDF	IO for the Tab Delimited Format (also known as Tab Separated Values).	1.0
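
As a minimal sketch of how one of the predefined formats in the table might be used, the following parses tab-delimited text with `CSVFormat.TDF` and collects the first column of every record. The class name `PredefinedFormatSketch` and the method `firstColumn` are hypothetical names chosen for illustration, and the example assumes the Commons CSV jar is on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

// Hypothetical helper class, not part of Commons CSV.
final class PredefinedFormatSketch {

    /** Parses tab-delimited text with the predefined TDF format and returns the first value of each record. */
    static List<String> firstColumn(final String text) throws IOException {
        final List<String> values = new ArrayList<>();
        // CSVFormat.parse(Reader) returns a CSVParser, which is Closeable and Iterable<CSVRecord>.
        try (CSVParser parser = CSVFormat.TDF.parse(new StringReader(text))) {
            for (final CSVRecord record : parser) {
                values.add(record.get(0)); // access by zero-based column index
            }
        }
        return values;
    }

    public static void main(final String[] args) throws IOException {
        System.out.println(firstColumn("a\tb\nc\td\n"));
    }
}
```

Swapping `CSVFormat.TDF` for another constant from the table (for example `CSVFormat.EXCEL`) should be the only change needed to read a different dialect.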


commons-csv-1.4-bin.tar.gz	md5	pgp	commons-csv-1.14.1-bin.tar.gz	sha512	pgp
commons-csv-1.4-bin.zip	md5	pgp	commons-csv-1.14.1-bin.zip	sha512	pgp


commons-csv-1.4-src.tar.gz	md5	pgp	commons-csv-1.14.1-src.tar.gz	sha512	pgp
commons-csv-1.4-src.zip	md5	pgp	commons-csv-1.14.1-src.zip	sha512	pgp
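
The newer artifacts listed above are published with `sha512` checksums. As a rough sketch, using only JDK classes, a downloaded archive's SHA-512 digest could be computed as follows and compared by eye (or by script) with the published value. The class name `ChecksumSketch` and the method `sha512Hex` are hypothetical, and this does not replace verifying the `pgp` signature.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class for illustration only.
final class ChecksumSketch {

    /** Reads the stream to its end and returns the SHA-512 digest as lower-case hex. */
    static String sha512Hex(final InputStream in) throws IOException, NoSuchAlgorithmException {
        final MessageDigest digest = MessageDigest.getInstance("SHA-512");
        final byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            digest.update(buffer, 0, n);
        }
        final StringBuilder hex = new StringBuilder();
        for (final byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(final String[] args) throws Exception {
        // args[0] is the path to the downloaded archive.
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(sha512Hex(in));
        }
    }
}
```

Run it with the path of the downloaded file as the only argument and compare the printed digest with the contents of the matching `.sha512` file.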

diff --git a/src/site/xdoc/index.xml b/src/site/xdoc/index.xml index 3238804102..ac5b8cfa9f 100644 --- a/src/site/xdoc/index.xml +++ b/src/site/xdoc/index.xml @@ -7,7 +7,7 @@ The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at - http://www.apache.org/licenses/LICENSE-2.0 + https://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, @@ -18,90 +18,84 @@ limitations under the License. Home - Commons Documentation Team + Apache Commons Team + +

+ -

Commons CSV reads and writes files in variations of the Comma Separated Value (CSV) format.

The most common CSV formats are predefined in the CSVFormat class: -

Microsoft Excel
Informix UNLOAD
Informix UNLOAD CSV
MySQL
RFC 4180
TDF

Custom formats can be created using a fluent style API.
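
A minimal sketch of the fluent style, assuming the `CSVFormat.Builder` API that the benchmark code in this change also uses (`CSVFormat.Builder.create()` ... `build()`). The delimiter, quote character, header names, and the class name `CustomFormatSketch` are illustrative choices, not values taken from the project; newer releases may prefer a different terminal method than `build()`.

```java
import java.io.IOException;
import java.io.StringReader;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

// Hypothetical example class, not part of Commons CSV.
final class CustomFormatSketch {

    /** Builds a semicolon-delimited, single-quoted format and returns the "Name" value of the first record. */
    static String nameOfFirstRecord(final String text) throws IOException {
        final CSVFormat format = CSVFormat.Builder.create()
                .setDelimiter(';')        // fields separated by semicolons
                .setQuote('\'')           // values may be wrapped in single quotes
                .setHeader("ID", "Name")  // column names supplied by the caller, not read from the input
                .build();
        try (CSVParser parser = format.parse(new StringReader(text))) {
            final CSVRecord first = parser.iterator().next();
            return first.get("Name");
        }
    }

    public static void main(final String[] args) throws IOException {
        System.out.println(nameOfFirstRecord("1;'a;b'\n"));
    }
}
```

Because the header is set explicitly and the header record is not skipped, the first line of the input is treated as data.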

Read the documentation starting with the Javadoc Overview.

An overview of the functionality is provided in the -user guide. +user guide. Various project reports are also available.

The Javadoc API documents are available online:

-The git repository can be -browsed. +The git repository can be +browsed.

Apache Commons CSV 1.4 (mirrors) requires Java 1.6
Apache Commons CSV 1.3 (archives) requires Java 1.6
Apache Commons CSV 1.2 (archives) requires Java 1.6
Apache Commons CSV 1.1 (archives) requires Java 1.6
Apache Commons CSV 1.0 (archives) requires Java 1.6
Download Apache Commons CSV current (mirrors), requires Java 8 or above
Download Apache Commons CSV archived releases

See the -Download Page -for the latest releases.
+Download Page +for the latest releases.

-Change reports are also available. +Release History are also available.

-For previous releases, see the Apache Archive +For previous releases, see the Apache Archive

- Alternatively, you can pull it from a Maven repository: -

<dependency>
-    <groupId>org.apache.commons</groupId>
-    <artifactId>commons-csv</artifactId>
-    <version>1.3</version>
-</dependency>

- For other dependency access methods, see Dependency Information + For dependency access methods, see Dependency Information

The latest code can be checked out from our git repository at https://git-wip-us.apache.org/repos/asf/commons-csv.git. +

The latest code can be checked out from our git repository at https://gitbox.apache.org/repos/asf/commons-csv.git. You can build the component using Apache Maven using mvn clean package.

- +

+ Apache Commons CSV requires Java 8 or above. +


Commons CSV	Java	Android
1.10.0+	8	Android 7.0 (API level 24)

The commons developer mailing list is the main channel of communication for contributors. Please remember that the lists are shared between all commons components, so prefix your email by [csv].

You can also visit the #apache-commons IRC channel on irc.freenode.net or peruse JIRA. Specific links of interest for JIRA are:

You can also peruse JIRA. Specific links of interest for JIRA are:

Ideas looking for code: Patch Needed
Issues with patches, looking for reviews: Review Patch

TagList report

If you'd like to offer up pull requests via GitHub rather than applying patches to JIRA, we have a GitHub mirror.

The commons mailing lists act as the main support forum. @@ -123,22 +116,18 @@ For previous releases, see the

Commons CSV was started to unify a common and simple interface for reading and writing CSV files under an ASL license. It has been bootstrapped by a code donation from Netcetera in Switzerland. There are three pre-existing BSD compatible CSV parsers which this component will hopefully make redundant (authors willing):

Skife CSV
Open CSV
Genjava CSV
Skife CSV (unsafe link as noted by browser https://kasparov.skife.org/csv/)
Open CSV
Genjava CSV

In addition to the code from Netcetera (org.apache.commons.csv), Martin van den Bemt has added an additional writer API.

Other CSV implementations:

Super CSV
Super CSV

- - diff --git a/src/site/xdoc/issue-tracking.xml b/src/site/xdoc/issue-tracking.xml index c15ab0e651..3aa64b4042 100644 --- a/src/site/xdoc/issue-tracking.xml +++ b/src/site/xdoc/issue-tracking.xml @@ -7,7 +7,7 @@ The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at - http://www.apache.org/licenses/LICENSE-2.0 + https://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, @@ -26,7 +26,7 @@ limitations under the License. | commons-build-plugin/trunk/src/main/resources/commons-xdoc-templates | +======================================================================+ | | - | 1) Re-generate using: mvn commons:jira-page | + | 1) Re-generate using: mvn commons-build:jira-page | | | | 2) Set the following properties in the component's pom: | | - commons.jira.id (required, alphabetic, upper case) | @@ -41,10 +41,12 @@ limitations under the License. | | +======================================================================+ --> - + Apache Commons CSV Issue tracking - Apache Commons Documentation Team + Apache Commons Team @@ -64,6 +66,7 @@ limitations under the License.

If you would like to report a bug, or raise an enhancement request with Apache Commons CSV please do the following: +

Search existing open bugs. If you find your issue listed then please add a comment with your details.
Submit either a bug report or enhancement request.

Please also remember these points: +

the more information you provide, the better we can help you
test cases are vital, particularly for any proposed enhancements
the developers of Apache Commons CSV are all unpaid volunteers

- For more information on subversion and creating patches see the - Apache Contributors Guide. + For more information on creating patches see the + Apache Contributors Guide.

You may also find these links useful: +

diff --git a/src/site/xdoc/mail-lists.xml b/src/site/xdoc/mail-lists.xml index 8eb001ef51..345cef8996 100644 --- a/src/site/xdoc/mail-lists.xml +++ b/src/site/xdoc/mail-lists.xml @@ -7,7 +7,7 @@ The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at - http://www.apache.org/licenses/LICENSE-2.0 + https://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, @@ -26,7 +26,7 @@ limitations under the License. | commons-build-plugin/trunk/src/main/resources/commons-xdoc-templates | +======================================================================+ | | - | 1) Re-generate using: mvn commons:mail-page | + | 1) Re-generate using: mvn commons-build:mail-page | | | | 2) Set the following properties in the component's pom: | | - commons.componentid (required, alphabetic, lower case) | @@ -39,29 +39,31 @@ limitations under the License. | | +======================================================================+ --> - + Apache Commons CSV Mailing Lists - Apache Commons Documentation Team + Apache Commons Team

Apache Commons CSV shares mailing lists with all the other - Commons Components. + Commons Components. To make it easier for people to only read messages related to components they are interested in, the convention in Commons is to prefix the subject line of messages with the component's name, for example: -

[csv] Problem with the ...

[csv] Problem with the ...

Questions related to the usage of Apache Commons CSV should be posted to the - User List. + User List.
- The Developer List + The Developer List is for questions and discussion related to the development of Apache Commons CSV.
Please do not cross-post; developers are also subscribed to the user list. @@ -70,8 +72,10 @@ limitations under the License. to subscribe.

- Note: please don't send patches or attachments to any of the mailing lists. + Note: please don't send patches or attachments to any of the mailing lists; + most of the lists are set up to drop attachments. Patches are best handled via the Issue Tracking system. + If you have a GitHub account, most components also accept PRs (pull requests). Otherwise, please upload the file to a public server and include the URL in the mail.

@@ -105,10 +109,11 @@ limitations under the License. Subscribe Unsubscribe Post - mail-archives.apache.org - markmail.org
- www.mail-archive.com
- news.gmane.org + + lists.apache.org + + + www.mail-archive.com @@ -123,10 +128,11 @@ limitations under the License. Subscribe Unsubscribe Post - mail-archives.apache.org - markmail.org
- www.mail-archive.com
- news.gmane.org + + lists.apache.org + + + www.mail-archive.com @@ -141,9 +147,11 @@ limitations under the License. Subscribe Unsubscribe read only - mail-archives.apache.org - markmail.org
- www.mail-archive.com + + lists.apache.org + + + www.mail-archive.com @@ -152,15 +160,17 @@ limitations under the License. Commons Commits List

- Only for e-mails automatically generated by the source control sytem. + Only for e-mails automatically generated by the source control system.

Subscribe Unsubscribe read only - mail-archives.apache.org - markmail.org
- www.mail-archive.com + + lists.apache.org + + + www.mail-archive.com @@ -191,11 +201,11 @@ limitations under the License. Subscribe Unsubscribe read only - mail-archives.apache.org - markmail.org
- old.nabble.com
- www.mail-archive.com
- news.gmane.org + + lists.apache.org + + + www.mail-archive.com diff --git a/src/site/xdoc/security.xml b/src/site/xdoc/security.xml new file mode 100644 index 0000000000..47edf5d116 --- /dev/null +++ b/src/site/xdoc/security.xml @@ -0,0 +1,56 @@ + + + + + Apache Commons Security Reports + Apache Commons Team + + +

+ For information about reporting or asking questions about security, please see + Apache Commons Security. +

This page lists all security vulnerabilities fixed in released versions of this component. +

Please note that binary patches are never provided. If you need to apply a source code patch, use the building instructions for the component version + that you are using. +

+ If you need help on building this component or other help on following the instructions to mitigate the known vulnerabilities listed here, please send + your questions to the public + user mailing list. +

If you have encountered an unlisted security vulnerability or other unexpected behavior that has security impact, or if the descriptions here are + incomplete, please report them privately to the Apache Security Team. Thank you. +

None.

+ For information about safe deserialization, please see Safe Deserialization. +

+ + \ No newline at end of file diff --git a/src/site/xdoc/user-guide.xml b/src/site/xdoc/user-guide.xml index 1f89ebe884..d5a1f26850 100644 --- a/src/site/xdoc/user-guide.xml +++ b/src/site/xdoc/user-guide.xml @@ -7,7 +7,7 @@ The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at - http://www.apache.org/licenses/LICENSE-2.0 + https://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, @@ -21,152 +21,6 @@ limitations under the License. Apache Commons Documentation Team - - -

Apache Commons CSV User Guide

- - - - -

- Parsing files with Apache Commons CSV is relatively straight forward.
- The CSVFormat class provides some commonly used CSV variants:

EXCEL: The Microsoft Excel CSV format.
INFORMIX_UNLOAD: Informix UNLOAD format used by the UNLOAD TO file_name operation.
INFORMIX_UNLOAD_CSV: Informix CSV UNLOAD format used by the UNLOAD TO file_name operation (escaping is disabled.)
MYSQL: The Oracle MySQL CSV format.
RFC-4180: The RFC-4180 format defined by RFC-4180
TDF: A tab delimited format

- - -

To parse an Excel CSV file, write:

-Reader in = new FileReader("path/to/file.csv");
-Iterable<CSVRecord> records = CSVFormat.EXCEL.parse(in);
-for (CSVRecord record : records) {
-    String lastName = record.get("Last Name");
-    String firstName = record.get("First Name");
-}

- To handle files that start with a Byte Order Mark (BOM) like some Excel CSV files, you need an extra step to
- deal with these optional bytes.
- You can use the BOMInputStream class from Apache Commons IO, for example:

-final URL url = ...;
-final Reader reader = new InputStreamReader(new BOMInputStream(url.openStream()), "UTF-8");
-final CSVParser parser = new CSVParser(reader, CSVFormat.EXCEL.withHeader());
-try {
-    for (final CSVRecord record : parser) {
-        final String string = record.get("SomeColumn");
-        ...
-    }
-} finally {
-    parser.close();
-    reader.close();
-}

- You might find it handy to create something like this: -

-/**
-* Creates a reader capable of handling BOMs.
-*/
-public InputStreamReader newReader(final InputStream inputStream) {
-    return new InputStreamReader(new BOMInputStream(inputStream), StandardCharsets.UTF_8);
-}

- -

- Apache Commons CSV provides several ways to access record values.
- The simplest way is to access values by their index in the record.
- However, columns in CSV files often have a name, for example: ID, CustomerNo, Birthday, etc.
- The CSVFormat class provides an API for specifing these header names and CSVRecord on
- the other hand has methods to access values by their corresponding header name.
-
- To access a record value by index, no special configuration of the CSVFormat is necessary:
-Reader in = new FileReader("path/to/file.csv");
-Iterable<CSVRecord> records = CSVFormat.RFC4180.parse(in);
-for (CSVRecord record : records) {
-    String columnOne = record.get(0);
-    String columnTwo = record.get(1);
-}
-
- Indices may not be the most intuitive way to access record values. For this reason it is possible to
- assign names to each column in the file:
-Reader in = new FileReader("path/to/file.csv");
-Iterable<CSVRecord> records = CSVFormat.RFC4180.withHeader("ID", "CustomerNo", "Name").parse(in);
-for (CSVRecord record : records) {
-    String id = record.get("ID");
-    String customerNo = record.get("CustomerNo");
-    String name = record.get("Name");
-}
- Note that column values can still be accessed using their index.
-
- Using String values all over the code to reference columns can be error prone. For this reason,
- it is possible to define an enum to specify header names. Note that the enum constant names are
- used to access column values. This may lead to enums constant names which do not follow the Java
- coding standard of defining constants in upper case with underscores:
-public enum Headers {
-    ID, CustomerNo, Name
-}
-Reader in = new FileReader("path/to/file.csv");
-Iterable<CSVRecord> records = CSVFormat.RFC4180.withHeader(Headers.class).parse(in);
-for (CSVRecord record : records) {
-    String id = record.get(Headers.ID);
-    String customerNo = record.get(Headers.CustomerNo);
-    String name = record.get(Headers.Name);
-}
- Again it is possible to access values by their index and by using a String (for example "CustomerNo").
-
- Some CSV files define header names in their first record. If configured, Apache Commons CSV can parse
- the header names from the first record:
-Reader in = new FileReader("path/to/file.csv");
-Iterable<CSVRecord> records = CSVFormat.RFC4180.withFirstRecordAsHeader().parse(in);
-for (CSVRecord record : records) {
-    String id = record.get("ID");
-    String customerNo = record.get("CustomerNo");
-    String name = record.get("Name");
-}
- This will use the values from the first record as header names and skip the first record when iterating.

- To print a CSV file with headers, you specify the headers in the format: -

-final Appendable out = ...;
-final CSVPrinter printer = CSVFormat.DEFAULT.withHeader("H1", "H2").print(out)

- To print a CSV file with JDBC column labels, you specify the ResultSet in the format: -

-final ResultSet resultSet = ...;
-final CSVPrinter printer = CSVFormat.DEFAULT.withHeader(resultSet).print(out)

- +

The User Guide migrated to the Javadoc.

diff --git a/src/test/java/org/apache/commons/csv/AssertionsTest.java b/src/test/java/org/apache/commons/csv/AssertionsTest.java deleted file mode 100644 index ca2c8d080f..0000000000 --- a/src/test/java/org/apache/commons/csv/AssertionsTest.java +++ /dev/null @@ -1,36 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ - -package org.apache.commons.csv; - -import org.junit.Test; - -/** - * @version $Id$ - */ -public class AssertionsTest { - - @Test - public void testNotNull() throws Exception { - Assertions.notNull(new Object(), "object"); - } - - @Test(expected = IllegalArgumentException.class) - public void testNotNullNull() throws Exception { - Assertions.notNull(null, "object"); - } -} diff --git a/src/test/java/org/apache/commons/csv/CSVBenchmark.java b/src/test/java/org/apache/commons/csv/CSVBenchmark.java index b2ecd52842..b1be4ce095 100644 --- a/src/test/java/org/apache/commons/csv/CSVBenchmark.java +++ b/src/test/java/org/apache/commons/csv/CSVBenchmark.java @@ -1,33 +1,35 @@ /* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. 
- * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 + * https://www.apache.org/licenses/LICENSE-2.0 * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
*/ package org.apache.commons.csv; import java.io.BufferedReader; -import java.io.File; -import java.io.FileInputStream; import java.io.IOException; import java.io.InputStream; +import java.io.Reader; import java.io.StringReader; -import java.util.List; +import java.nio.charset.StandardCharsets; +import java.util.Iterator; +import java.util.Scanner; import java.util.concurrent.TimeUnit; import java.util.zip.GZIPInputStream; -import com.generationjava.io.CsvReader; import org.apache.commons.io.IOUtils; import org.apache.commons.lang3.StringUtils; import org.openjdk.jmh.annotations.Benchmark; @@ -45,6 +47,10 @@ import org.supercsv.io.CsvListReader; import org.supercsv.prefs.CsvPreference; +import com.generationjava.io.CsvReader; +import com.opencsv.CSVParserBuilder; +import com.opencsv.CSVReaderBuilder; + @BenchmarkMode(Mode.AverageTime) @Fork(value = 1, jvmArgs = {"-server", "-Xms1024M", "-Xmx1024M"}) @Threads(1) @@ -54,157 +60,169 @@ @State(Scope.Benchmark) public class CSVBenchmark { + private static final class CountingReaderCallback implements org.skife.csv.ReaderCallback { + public int count; + + @Override + public void onRow(final String[] fields) { + count++; + } + } + private String data; + private Reader getReader() { + return new StringReader(data); + } + /** * Load the data in memory before running the benchmarks, this takes out IO from the results. 
*/ @Setup public void init() throws IOException { - final File file = new File("src/test/resources/perf/worldcitiespop.txt.gz"); - final InputStream in = new GZIPInputStream(new FileInputStream(file)); - this.data = IOUtils.toString(in, "ISO-8859-1"); - in.close(); - } - - private BufferedReader getReader() throws IOException { - return new BufferedReader(new StringReader(data)); - } - - @Benchmark - public int read(final Blackhole bh) throws Exception { - final BufferedReader in = getReader(); - int count = 0; - String line; - while ((line = in.readLine()) != null) { - count++; + try (InputStream in = this.getClass().getClassLoader().getResourceAsStream("org/apache/commons/csv/perf/worldcitiespop.txt.gz"); + InputStream gzin = new GZIPInputStream(in, 8192)) { + this.data = IOUtils.toString(gzin, StandardCharsets.ISO_8859_1); } - - bh.consume(count); - in.close(); - return count; - } - - @Benchmark - public int split(final Blackhole bh) throws Exception { - final BufferedReader in = getReader(); - int count = 0; - String line; - while ((line = in.readLine()) != null) { - final String[] values = StringUtils.split(line, ','); - count += values.length; - } - - bh.consume(count); - in.close(); - return count; } @Benchmark public int parseCommonsCSV(final Blackhole bh) throws Exception { - final BufferedReader in = getReader(); - - final CSVFormat format = CSVFormat.DEFAULT.withHeader(); - int count = 0; - for (final CSVRecord record : format.parse(in)) { - count++; + + try (Reader in = getReader()) { + final CSVFormat format = CSVFormat.Builder.create().setSkipHeaderRecord(true).build(); + final Iterator iter = format.parse(in).iterator(); + while (iter.hasNext()) { + count++; + iter.next(); + } } bh.consume(count); - in.close(); return count; } @Benchmark public int parseGenJavaCSV(final Blackhole bh) throws Exception { - final BufferedReader in = getReader(); - - final CsvReader reader = new CsvReader(in); - reader.setFieldDelimiter(','); - int count = 0; - String[] 
record = null; - while ((record = reader.readLine()) != null) { - count++; + + try (Reader in = getReader()) { + final CsvReader reader = new CsvReader(in); + reader.setFieldDelimiter(','); + while (reader.readLine() != null) { + count++; + } } bh.consume(count); - in.close(); return count; } @Benchmark public int parseJavaCSV(final Blackhole bh) throws Exception { - final BufferedReader in = getReader(); - - final com.csvreader.CsvReader reader = new com.csvreader.CsvReader(in, ','); - reader.setRecordDelimiter('\n'); - int count = 0; - while (reader.readRecord()) { - count++; + + try (Reader in = getReader()) { + final com.csvreader.CsvReader reader = new com.csvreader.CsvReader(in, ','); + reader.setRecordDelimiter('\n'); + while (reader.readRecord()) { + count++; + } } bh.consume(count); - in.close(); return count; } @Benchmark public int parseOpenCSV(final Blackhole bh) throws Exception { - final BufferedReader in = getReader(); - - final com.opencsv.CSVReader reader = new com.opencsv.CSVReader(in, ','); - int count = 0; - while (reader.readNext() != null) { - count++; + + final com.opencsv.CSVParser parser = new CSVParserBuilder() + .withSeparator(',').withIgnoreQuotations(true).build(); + + try (Reader in = getReader()) { + final com.opencsv.CSVReader reader = new CSVReaderBuilder(in).withSkipLines(1).withCSVParser(parser).build(); + while (reader.readNext() != null) { + count++; + } } bh.consume(count); - in.close(); return count; } @Benchmark public int parseSkifeCSV(final Blackhole bh) throws Exception { - final BufferedReader in = getReader(); - final org.skife.csv.CSVReader reader = new org.skife.csv.SimpleReader(); reader.setSeperator(','); - final CountingReaderCallback callback = new CountingReaderCallback(); - reader.parse(in, callback); + + try (Reader in = getReader()) { + reader.parse(in, callback); + } bh.consume(callback); - in.close(); return callback.count; } - private static class CountingReaderCallback implements 
org.skife.csv.ReaderCallback { - public int count = 0; + @Benchmark + public int parseSuperCSV(final Blackhole bh) throws Exception { + int count = 0; - @Override - public void onRow(final String[] fields) { - count++; + try (CsvListReader reader = new CsvListReader(getReader(), CsvPreference.STANDARD_PREFERENCE)) { + while (reader.read() != null) { + count++; + } } + + bh.consume(count); + return count; } @Benchmark - public int parseSuperCSV(final Blackhole bh) throws Exception { - final BufferedReader in = getReader(); - - final CsvListReader reader = new CsvListReader(in, CsvPreference.STANDARD_PREFERENCE); + public int read(final Blackhole bh) throws Exception { + int count = 0; + + try (BufferedReader reader = new BufferedReader(getReader())) { + while (reader.readLine() != null) { + count++; + } + } + + bh.consume(count); + return count; + } + @Benchmark + public int scan(final Blackhole bh) throws Exception { int count = 0; - List record = null; - while ((record = reader.read()) != null) { - count++; + + try (Scanner scanner = new Scanner(getReader())) { + while (scanner.hasNextLine()) { + scanner.nextLine(); + count++; + } } bh.consume(count); - in.close(); return count; } + + @Benchmark + public int split(final Blackhole bh) throws Exception { + int count = 0; + + try (BufferedReader reader = new BufferedReader(getReader())) { + String line; + while ((line = reader.readLine()) != null) { + final String[] values = StringUtils.split(line, ','); + count += values.length; + } + } + + bh.consume(count); + return count; + } } diff --git a/src/test/java/org/apache/commons/csv/CSVDuplicateHeaderTest.java b/src/test/java/org/apache/commons/csv/CSVDuplicateHeaderTest.java new file mode 100644 index 0000000000..2f518a1206 --- /dev/null +++ b/src/test/java/org/apache/commons/csv/CSVDuplicateHeaderTest.java @@ -0,0 +1,339 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. 
See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.commons.csv; + +import static org.junit.jupiter.api.Assertions.assertArrayEquals; +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertThrows; + +import java.io.IOException; +import java.util.Arrays; +import java.util.List; +import java.util.stream.Collectors; +import java.util.stream.Stream; + +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.Arguments; +import org.junit.jupiter.params.provider.MethodSource; + +/** + * Tests parsing of duplicate column names in a CSV header. + * The test verifies that headers are consistently handled by CSVFormat and CSVParser. + */ +class CSVDuplicateHeaderTest { + + /** + * Return test cases for duplicate header data for use in CSVFormat. + *

+ * This filters the parsing test data to all cases where the allow missing column + * names flag is true and ignore header case is false: these flags are exclusively for parsing. + * CSVFormat validation applies to both parsing and writing and thus validation + * is less strict and behaves as if the allow missing column names constraint and + * the ignore header case behavior are absent. + * The filtered data is then returned with the parser flags set to both true and false + * for each test case. + *

+ * + * @return the stream of arguments + */ + static Stream duplicateHeaderAllowsMissingColumnsNamesData() { + return duplicateHeaderData() + .filter(arg -> Boolean.TRUE.equals(arg.get()[1]) && Boolean.FALSE.equals(arg.get()[2])) + .flatMap(arg -> { + // Return test case with flags as all true/false combinations + final Object[][] data = new Object[4][]; + final Boolean[] flags = {Boolean.TRUE, Boolean.FALSE}; + int i = 0; + for (final Boolean a : flags) { + for (final Boolean b : flags) { + data[i] = arg.get().clone(); + data[i][1] = a; + data[i][2] = b; + i++; + } + } + return Arrays.stream(data).map(Arguments::of); + }); + } + + /** + * Return test cases for duplicate header data for use in parsing (CSVParser). Uses the order: + *

+     * DuplicateHeaderMode duplicateHeaderMode
+     * boolean allowMissingColumnNames
+     * String[] headers
+     * boolean valid
+     *

+ * + * @return the stream of arguments + */ + static Stream duplicateHeaderData() { + return Stream.of( + // Any combination with a valid header + Arguments.of(DuplicateHeaderMode.DISALLOW, false, false, new String[] {"A", "B"}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, false, new String[] {"A", "B"}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, false, new String[] {"A", "B"}, true), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, false, new String[] {"A", "B"}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, false, new String[] {"A", "B"}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, false, new String[] {"A", "B"}, true), + + // Any combination with a valid header including empty + Arguments.of(DuplicateHeaderMode.DISALLOW, false, false, new String[] {"A", ""}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, false, new String[] {"A", ""}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, false, new String[] {"A", ""}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, false, new String[] {"A", ""}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, false, new String[] {"A", ""}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, false, new String[] {"A", ""}, true), + + // Any combination with a valid header including blank (1 space) + Arguments.of(DuplicateHeaderMode.DISALLOW, false, false, new String[] {"A", " "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, false, new String[] {"A", " "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, false, new String[] {"A", " "}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, false, new String[] {"A", " "}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, false, new String[] {"A", " "}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, false, new String[] {"A", " "}, true), + + // Any combination with a valid header including null + 
Arguments.of(DuplicateHeaderMode.DISALLOW, false, false, new String[] {"A", null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, false, new String[] {"A", null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, false, new String[] {"A", null}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, false, new String[] {"A", null}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, false, new String[] {"A", null}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, false, new String[] {"A", null}, true), + + // Duplicate non-empty names + Arguments.of(DuplicateHeaderMode.DISALLOW, false, false, new String[] {"A", "A"}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, false, new String[] {"A", "A"}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, false, new String[] {"A", "A"}, true), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, false, new String[] {"A", "A"}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, false, new String[] {"A", "A"}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, false, new String[] {"A", "A"}, true), + + // Duplicate empty names + Arguments.of(DuplicateHeaderMode.DISALLOW, false, false, new String[] {"", ""}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, false, new String[] {"", ""}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, false, new String[] {"", ""}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, false, new String[] {"", ""}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, false, new String[] {"", ""}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, false, new String[] {"", ""}, true), + + // Duplicate blank names (1 space) + Arguments.of(DuplicateHeaderMode.DISALLOW, false, false, new String[] {" ", " "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, false, new String[] {" ", " "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, 
false, new String[] {" ", " "}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, false, new String[] {" ", " "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, false, new String[] {" ", " "}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, false, new String[] {" ", " "}, true), + + // Duplicate blank names (3 spaces) + Arguments.of(DuplicateHeaderMode.DISALLOW, false, false, new String[] {"   ", "   "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, false, new String[] {"   ", "   "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, false, new String[] {"   ", "   "}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, false, new String[] {"   ", "   "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, false, new String[] {"   ", "   "}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, false, new String[] {"   ", "   "}, true), + + // Duplicate null names + Arguments.of(DuplicateHeaderMode.DISALLOW, false, false, new String[] {null, null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, false, new String[] {null, null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, false, new String[] {null, null}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, false, new String[] {null, null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, false, new String[] {null, null}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, false, new String[] {null, null}, true), + + // Duplicate blank names (1+3 spaces) + Arguments.of(DuplicateHeaderMode.DISALLOW, false, false, new String[] {" ", "   "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, false, new String[] {" ", "   "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, false, new String[] {" ", "   "}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, false, new String[] {" ", "   "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, false, new String[] {" ", "   "}, true),
+ Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, false, new String[] {" ", "   "}, true), + + // Duplicate blank names and null names + Arguments.of(DuplicateHeaderMode.DISALLOW, false, false, new String[] {" ", null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, false, new String[] {" ", null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, false, new String[] {" ", null}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, false, new String[] {" ", null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, false, new String[] {" ", null}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, false, new String[] {" ", null}, true), + + // Duplicate non-empty and empty names + Arguments.of(DuplicateHeaderMode.DISALLOW, false, false, new String[] {"A", "A", "", ""}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, false, new String[] {"A", "A", "", ""}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, false, new String[] {"A", "A", "", ""}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, false, new String[] {"A", "A", "", ""}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, false, new String[] {"A", "A", "", ""}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, false, new String[] {"A", "A", "", ""}, true), + + // Non-duplicate non-empty and duplicate empty names + Arguments.of(DuplicateHeaderMode.DISALLOW, false, false, new String[] {"A", "B", "", ""}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, false, new String[] {"A", "B", "", ""}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, false, new String[] {"A", "B", "", ""}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, false, new String[] {"A", "B", "", ""}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, false, new String[] {"A", "B", "", ""}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, false, new String[] {"A", "B", 
"", ""}, true), + + // Duplicate non-empty and blank names + Arguments.of(DuplicateHeaderMode.DISALLOW, false, false, new String[] {"A", "A", " ", " "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, false, new String[] {"A", "A", " ", " "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, false, new String[] {"A", "A", " ", " "}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, false, new String[] {"A", "A", " ", " "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, false, new String[] {"A", "A", " ", " "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, false, new String[] {"A", "A", " ", " "}, true), + + // Duplicate non-empty and null names + Arguments.of(DuplicateHeaderMode.DISALLOW, false, false, new String[] {"A", "A", null, null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, false, new String[] {"A", "A", null, null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, false, new String[] {"A", "A", null, null}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, false, new String[] {"A", "A", null, null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, false, new String[] {"A", "A", null, null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, false, new String[] {"A", "A", null, null}, true), + + // Duplicate blank names + Arguments.of(DuplicateHeaderMode.DISALLOW, false, false, new String[] {"A", "", ""}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, false, new String[] {"A", "", ""}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, false, new String[] {"A", "", ""}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, false, new String[] {"A", "", ""}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, false, new String[] {"A", "", ""}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, false, new String[] {"A", "", ""}, true), + + // Duplicate null names + 
Arguments.of(DuplicateHeaderMode.DISALLOW, false, false, new String[] {"A", null, null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, false, new String[] {"A", null, null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, false, new String[] {"A", null, null}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, false, new String[] {"A", null, null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, false, new String[] {"A", null, null}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, false, new String[] {"A", null, null}, true), + + // Duplicate blank names (1+3 spaces) + Arguments.of(DuplicateHeaderMode.DISALLOW, false, false, new String[] {"A", " ", "   "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, false, new String[] {"A", " ", "   "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, false, new String[] {"A", " ", "   "}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, false, new String[] {"A", " ", "   "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, false, new String[] {"A", " ", "   "}, true), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, false, new String[] {"A", " ", "   "}, true), + + // Duplicate names (case insensitive) + Arguments.of(DuplicateHeaderMode.DISALLOW, false, true , new String[] {"A", "a"}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, true , new String[] {"A", "a"}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, true , new String[] {"A", "a"}, true), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, true , new String[] {"A", "a"}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, true , new String[] {"A", "a"}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, true , new String[] {"A", "a"}, true), + + // Duplicate non-empty (case insensitive) and empty names + Arguments.of(DuplicateHeaderMode.DISALLOW, false, true, new String[] {"A", "a", "", ""}, false), + 
Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, true, new String[] {"A", "a", "", ""}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, true, new String[] {"A", "a", "", ""}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, true, new String[] {"A", "a", "", ""}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, true, new String[] {"A", "a", "", ""}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, true, new String[] {"A", "a", "", ""}, true), + + // Duplicate non-empty (case insensitive) and blank names + Arguments.of(DuplicateHeaderMode.DISALLOW, false, true, new String[] {"A", "a", " ", " "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, true, new String[] {"A", "a", " ", " "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, true, new String[] {"A", "a", " ", " "}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, true, new String[] {"A", "a", " ", " "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, true, new String[] {"A", "a", " ", " "}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, true, new String[] {"A", "a", " ", " "}, true), + + // Duplicate non-empty (case insensitive) and null names + Arguments.of(DuplicateHeaderMode.DISALLOW, false, true, new String[] {"A", "a", null, null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, false, true, new String[] {"A", "a", null, null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, false, true, new String[] {"A", "a", null, null}, false), + Arguments.of(DuplicateHeaderMode.DISALLOW, true, true, new String[] {"A", "a", null, null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, true, new String[] {"A", "a", null, null}, false), + Arguments.of(DuplicateHeaderMode.ALLOW_ALL, true, true, new String[] {"A", "a", null, null}, true) + ); + } + + /** + * Tests duplicate headers with the CSVFormat. 
+ * + * @param duplicateHeaderMode the duplicate header mode + * @param allowMissingColumnNames the allow missing column names flag (only used for parsing) + * @param ignoreHeaderCase the ignore header case flag (only used for parsing) + * @param headers the headers + * @param valid true if the settings are expected to be valid, otherwise expect a IllegalArgumentException + */ + @ParameterizedTest + @MethodSource(value = {"duplicateHeaderAllowsMissingColumnsNamesData"}) + void testCSVFormat(final DuplicateHeaderMode duplicateHeaderMode, + final boolean allowMissingColumnNames, + final boolean ignoreHeaderCase, + final String[] headers, + final boolean valid) { + final CSVFormat.Builder builder = + CSVFormat.DEFAULT.builder() + .setDuplicateHeaderMode(duplicateHeaderMode) + .setAllowMissingColumnNames(allowMissingColumnNames) + .setIgnoreHeaderCase(ignoreHeaderCase) + .setHeader(headers); + if (valid) { + final CSVFormat format = builder.get(); + assertEquals(duplicateHeaderMode, format.getDuplicateHeaderMode(), "DuplicateHeaderMode"); + assertEquals(allowMissingColumnNames, format.getAllowMissingColumnNames(), "AllowMissingColumnNames"); + assertArrayEquals(headers, format.getHeader(), "Header"); + } else { + assertThrows(IllegalArgumentException.class, builder::get); + } + } + + /** + * Tests duplicate headers with the CSVParser. + * + * @param duplicateHeaderMode the duplicate header mode + * @param allowMissingColumnNames the allow missing column names flag (only used for parsing) + * @param ignoreHeaderCase the ignore header case flag (only used for parsing) + * @param headers the headers (joined with the CSVFormat delimiter to create a string input) + * @param valid true if the settings are expected to be valid, otherwise expect a IllegalArgumentException + * @throws IOException Signals that an I/O exception has occurred. 
+ */ + @ParameterizedTest + @MethodSource(value = {"duplicateHeaderData"}) + void testCSVParser(final DuplicateHeaderMode duplicateHeaderMode, + final boolean allowMissingColumnNames, + final boolean ignoreHeaderCase, + final String[] headers, + final boolean valid) throws IOException { + // @formatter:off + final CSVFormat format = CSVFormat.DEFAULT.builder() + .setDuplicateHeaderMode(duplicateHeaderMode) + .setAllowMissingColumnNames(allowMissingColumnNames) + .setIgnoreHeaderCase(ignoreHeaderCase) + .setNullString("NULL") + .setHeader() + .get(); + // @formatter:on + final String input = Arrays.stream(headers) + .map(s -> s == null ? format.getNullString() : s) + .collect(Collectors.joining(format.getDelimiterString())); + // @formatter:off + if (valid) { + try (CSVParser parser = CSVParser.parse(input, format)) { + // Parser ignores null headers + final List<String> expected = Arrays.stream(headers).filter(s -> s != null).collect(Collectors.toList()); + assertEquals(expected, parser.getHeaderNames(), "HeaderNames"); + } + } else { + assertThrows(IllegalArgumentException.class, () -> CSVParser.parse(input, format)); + } + } +} diff --git a/src/test/java/org/apache/commons/csv/CSVFileParserTest.java b/src/test/java/org/apache/commons/csv/CSVFileParserTest.java index 50b78437f6..e74d0e6884 100644 --- a/src/test/java/org/apache/commons/csv/CSVFileParserTest.java +++ b/src/test/java/org/apache/commons/csv/CSVFileParserTest.java @@ -1,176 +1,148 @@ /* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. 
See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 + * https://www.apache.org/licenses/LICENSE-2.0 * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
*/ package org.apache.commons.csv; -import static org.junit.Assert.assertEquals; -import static org.junit.Assert.assertNotNull; -import static org.junit.Assert.assertTrue; -import static org.junit.Assert.fail; +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertNotNull; +import static org.junit.jupiter.api.Assertions.assertTrue; +import static org.junit.jupiter.api.Assertions.fail; import java.io.BufferedReader; import java.io.File; -import java.io.FileNotFoundException; import java.io.FileReader; -import java.io.FilenameFilter; import java.io.IOException; import java.net.URL; import java.nio.charset.Charset; -import java.util.ArrayList; +import java.nio.charset.StandardCharsets; import java.util.Arrays; -import java.util.Collection; -import java.util.List; +import java.util.stream.Stream; -import org.junit.Test; -import org.junit.runner.RunWith; -import org.junit.runners.Parameterized; -import org.junit.runners.Parameterized.Parameters; +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.MethodSource; /** * Parse tests using test files - * - * @version $Id$ */ -@RunWith(Parameterized.class) -public class CSVFileParserTest { - - private static final File BASE = new File("src/test/resources/CSVFileParser"); +class CSVFileParserTest { - private final BufferedReader testData; + private static final File BASE_DIR = new File("src/test/resources/org/apache/commons/csv/CSVFileParser"); - private final String testName; - - public CSVFileParserTest(final File file) throws FileNotFoundException { - this.testName = file.getName(); - this.testData = new BufferedReader(new FileReader(file)); + public static Stream<File> generateData() { + final File[] files = BASE_DIR.listFiles((dir, name) -> name.startsWith("test") && name.endsWith(".txt")); + return files != null ? 
Stream.of(files) : Stream.empty(); } - private String readTestData() throws IOException { + private String readTestData(final BufferedReader reader) throws IOException { String line; do { - line = testData.readLine(); + line = reader.readLine(); } while (line != null && line.startsWith("#")); return line; } - @Parameters - public static Collection generateData() { - final List list = new ArrayList<>(); - - final FilenameFilter filenameFilter = new FilenameFilter() { - - @Override - public boolean accept(final File dir, final String name) { - return name.startsWith("test") && name.endsWith(".txt"); - } - }; - final File[] files = BASE.listFiles(filenameFilter); - if (files != null) { - for (final File f : files) { - list.add(new Object[] { f }); - } - } - return list; - } - - @Test - public void testCSVFile() throws Exception { - String line = readTestData(); - assertNotNull("file must contain config line", line); - final String[] split = line.split(" "); - assertTrue(testName + " require 1 param", split.length >= 1); - // first line starts with csv data file name - CSVFormat format = CSVFormat.newFormat(',').withQuote('"'); - boolean checkComments = false; - for (int i = 1; i < split.length; i++) { - final String option = split[i]; - final String[] option_parts = option.split("=", 2); - if ("IgnoreEmpty".equalsIgnoreCase(option_parts[0])) { - format = format.withIgnoreEmptyLines(Boolean.parseBoolean(option_parts[1])); - } else if ("IgnoreSpaces".equalsIgnoreCase(option_parts[0])) { - format = format.withIgnoreSurroundingSpaces(Boolean.parseBoolean(option_parts[1])); - } else if ("CommentStart".equalsIgnoreCase(option_parts[0])) { - format = format.withCommentMarker(option_parts[1].charAt(0)); - } else if ("CheckComments".equalsIgnoreCase(option_parts[0])) { - checkComments = true; - } else { - fail(testName + " unexpected option: " + option); + @ParameterizedTest + @MethodSource("generateData") + void testCSVFile(final File testFile) throws Exception { + try 
(FileReader fr = new FileReader(testFile); BufferedReader testDataReader = new BufferedReader(fr)) { + String line = readTestData(testDataReader); + assertNotNull("file must contain config line", line); + final String[] split = line.split(" "); + assertTrue(split.length >= 1, testFile.getName() + " require 1 param"); + // first line starts with csv data file name + CSVFormat format = CSVFormat.newFormat(',').withQuote('"'); + boolean checkComments = false; + for (int i = 1; i < split.length; i++) { + final String option = split[i]; + final String[] optionParts = option.split("=", 2); + if ("IgnoreEmpty".equalsIgnoreCase(optionParts[0])) { + format = format.withIgnoreEmptyLines(Boolean.parseBoolean(optionParts[1])); + } else if ("IgnoreSpaces".equalsIgnoreCase(optionParts[0])) { + format = format.withIgnoreSurroundingSpaces(Boolean.parseBoolean(optionParts[1])); + } else if ("CommentStart".equalsIgnoreCase(optionParts[0])) { + format = format.withCommentMarker(optionParts[1].charAt(0)); + } else if ("CheckComments".equalsIgnoreCase(optionParts[0])) { + checkComments = true; + } else { + fail(testFile.getName() + " unexpected option: " + option); + } } - } - line = readTestData(); // get string version of format - assertEquals(testName + " Expected format ", line, format.toString()); - - // Now parse the file and compare against the expected results - // We use a buffered reader internally so no need to create one here. 
- try (final CSVParser parser = CSVParser.parse(new File(BASE, split[0]), Charset.defaultCharset(), format)) { - for (final CSVRecord record : parser) { - String parsed = Arrays.toString(record.values()); - if (checkComments) { - final String comment = record.getComment().replace("\n", "\\n"); - if (comment != null) { - parsed += "#" + comment; + line = readTestData(testDataReader); // get string version of format + assertEquals(line, format.toString(), testFile.getName() + " Expected format "); + + // Now parse the file and compare against the expected results + // We use a buffered reader internally so no need to create one here. + try (CSVParser parser = CSVParser.parse(new File(BASE_DIR, split[0]), Charset.defaultCharset(), format)) { + for (final CSVRecord record : parser) { + String parsed = Arrays.toString(record.values()); + final String comment = record.getComment(); + if (checkComments && comment != null) { + parsed += "#" + comment.replace("\n", "\\n"); } + final int count = record.size(); + assertEquals(readTestData(testDataReader), count + ":" + parsed, testFile.getName()); } - final int count = record.size(); - assertEquals(testName, readTestData(), count + ":" + parsed); } } } - @Test - public void testCSVUrl() throws Exception { - String line = readTestData(); - assertNotNull("file must contain config line", line); - final String[] split = line.split(" "); - assertTrue(testName + " require 1 param", split.length >= 1); - // first line starts with csv data file name - CSVFormat format = CSVFormat.newFormat(',').withQuote('"'); - boolean checkComments = false; - for (int i = 1; i < split.length; i++) { - final String option = split[i]; - final String[] option_parts = option.split("=", 2); - if ("IgnoreEmpty".equalsIgnoreCase(option_parts[0])) { - format = format.withIgnoreEmptyLines(Boolean.parseBoolean(option_parts[1])); - } else if ("IgnoreSpaces".equalsIgnoreCase(option_parts[0])) { - format = 
format.withIgnoreSurroundingSpaces(Boolean.parseBoolean(option_parts[1])); - } else if ("CommentStart".equalsIgnoreCase(option_parts[0])) { - format = format.withCommentMarker(option_parts[1].charAt(0)); - } else if ("CheckComments".equalsIgnoreCase(option_parts[0])) { - checkComments = true; - } else { - fail(testName + " unexpected option: " + option); + @ParameterizedTest + @MethodSource("generateData") + void testCSVUrl(final File testFile) throws Exception { + try (FileReader fr = new FileReader(testFile); BufferedReader testData = new BufferedReader(fr)) { + String line = readTestData(testData); + assertNotNull("file must contain config line", line); + final String[] split = line.split(" "); + assertTrue(split.length >= 1, testFile.getName() + " require 1 param"); + // first line starts with csv data file name + CSVFormat format = CSVFormat.newFormat(',').withQuote('"'); + boolean checkComments = false; + for (int i = 1; i < split.length; i++) { + final String option = split[i]; + final String[] optionParts = option.split("=", 2); + if ("IgnoreEmpty".equalsIgnoreCase(optionParts[0])) { + format = format.withIgnoreEmptyLines(Boolean.parseBoolean(optionParts[1])); + } else if ("IgnoreSpaces".equalsIgnoreCase(optionParts[0])) { + format = format.withIgnoreSurroundingSpaces(Boolean.parseBoolean(optionParts[1])); + } else if ("CommentStart".equalsIgnoreCase(optionParts[0])) { + format = format.withCommentMarker(optionParts[1].charAt(0)); + } else if ("CheckComments".equalsIgnoreCase(optionParts[0])) { + checkComments = true; + } else { + fail(testFile.getName() + " unexpected option: " + option); + } } - } - line = readTestData(); // get string version of format - assertEquals(testName + " Expected format ", line, format.toString()); - - // Now parse the file and compare against the expected results - final URL resource = ClassLoader.getSystemResource("CSVFileParser/" + split[0]); - try (final CSVParser parser = CSVParser.parse(resource, Charset.forName("UTF-8"), 
format)) { - for (final CSVRecord record : parser) { - String parsed = Arrays.toString(record.values()); - if (checkComments) { - final String comment = record.getComment().replace("\n", "\\n"); - if (comment != null) { - parsed += "#" + comment; + line = readTestData(testData); // get string version of format + assertEquals(line, format.toString(), testFile.getName() + " Expected format "); + + // Now parse the file and compare against the expected results + final URL resource = ClassLoader.getSystemResource("org/apache/commons/csv/CSVFileParser/" + split[0]); + try (CSVParser parser = CSVParser.parse(resource, StandardCharsets.UTF_8, format)) { + for (final CSVRecord record : parser) { + String parsed = Arrays.toString(record.values()); + final String comment = record.getComment(); + if (checkComments && comment != null) { + parsed += "#" + comment.replace("\n", "\\n"); } + final int count = record.size(); + assertEquals(readTestData(testData), count + ":" + parsed, testFile.getName()); } - final int count = record.size(); - assertEquals(testName, readTestData(), count + ":" + parsed); } } } diff --git a/src/test/java/org/apache/commons/csv/CSVFormatPredefinedTest.java b/src/test/java/org/apache/commons/csv/CSVFormatPredefinedTest.java index 1340534ac7..dad08cdb1d 100644 --- a/src/test/java/org/apache/commons/csv/CSVFormatPredefinedTest.java +++ b/src/test/java/org/apache/commons/csv/CSVFormatPredefinedTest.java @@ -1,57 +1,85 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. 
You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ - -package org.apache.commons.csv; - -import org.junit.Assert; -import org.junit.Test; - -/** - * Tests {@link CSVFormat.Predefined}. - */ -public class CSVFormatPredefinedTest { - - private void test(final CSVFormat format, final String enumName) { - Assert.assertEquals(format, CSVFormat.Predefined.valueOf(enumName).getFormat()); - Assert.assertEquals(format, CSVFormat.valueOf(enumName)); - } - - @Test - public void testDefault() { - test(CSVFormat.DEFAULT, "Default"); - } - - @Test - public void testExcel() { - test(CSVFormat.EXCEL, "Excel"); - } - - @Test - public void testMySQL() { - test(CSVFormat.MYSQL, "MySQL"); - } - - @Test - public void testRFC4180() { - test(CSVFormat.RFC4180, "RFC4180"); - } - - @Test - public void testTDF() { - test(CSVFormat.TDF, "TDF"); - } -} +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. 
See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.commons.csv; + +import static org.junit.jupiter.api.Assertions.assertEquals; + +import org.junit.jupiter.api.Test; + +/** + * Tests {@link CSVFormat.Predefined}. + */ +class CSVFormatPredefinedTest { + + private void test(final CSVFormat format, final String enumName) { + assertEquals(format, CSVFormat.Predefined.valueOf(enumName).getFormat()); + assertEquals(format, CSVFormat.valueOf(enumName)); + } + + @Test + void testDefault() { + test(CSVFormat.DEFAULT, "Default"); + } + + @Test + void testExcel() { + test(CSVFormat.EXCEL, "Excel"); + } + + @Test + void testMongoDbCsv() { + test(CSVFormat.MONGODB_CSV, "MongoDBCsv"); + } + + @Test + void testMongoDbTsv() { + test(CSVFormat.MONGODB_TSV, "MongoDBTsv"); + } + + @Test + void testMySQL() { + test(CSVFormat.MYSQL, "MySQL"); + } + + @Test + void testOracle() { + test(CSVFormat.ORACLE, "Oracle"); + } + + @Test + void testPostgreSqlCsv() { + test(CSVFormat.POSTGRESQL_CSV, "PostgreSQLCsv"); + } + + @Test + void testPostgreSqlText() { + test(CSVFormat.POSTGRESQL_TEXT, "PostgreSQLText"); + } + + @Test + void testRFC4180() { + test(CSVFormat.RFC4180, "RFC4180"); + } + + @Test + void testTDF() { + test(CSVFormat.TDF, "TDF"); + } +} diff --git a/src/test/java/org/apache/commons/csv/CSVFormatTest.java b/src/test/java/org/apache/commons/csv/CSVFormatTest.java index da734782f3..ed20898de9 100644 --- a/src/test/java/org/apache/commons/csv/CSVFormatTest.java +++ b/src/test/java/org/apache/commons/csv/CSVFormatTest.java @@ -1,18 +1,20 @@ /* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. 
- * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 + * https://www.apache.org/licenses/LICENSE-2.0 * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
*/ package org.apache.commons.csv; @@ -21,247 +23,713 @@ import static org.apache.commons.csv.Constants.CR; import static org.apache.commons.csv.Constants.CRLF; import static org.apache.commons.csv.Constants.LF; -import static org.junit.Assert.assertArrayEquals; -import static org.junit.Assert.assertEquals; -import static org.junit.Assert.assertFalse; -import static org.junit.Assert.assertNotNull; -import static org.junit.Assert.assertNotSame; -import static org.junit.Assert.assertTrue; +import static org.junit.jupiter.api.Assertions.assertArrayEquals; +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertNotEquals; +import static org.junit.jupiter.api.Assertions.assertNotNull; +import static org.junit.jupiter.api.Assertions.assertNotSame; +import static org.junit.jupiter.api.Assertions.assertNull; +import static org.junit.jupiter.api.Assertions.assertThrows; +import static org.junit.jupiter.api.Assertions.assertTrue; +import static org.junit.jupiter.api.Assertions.fail; import java.io.ByteArrayInputStream; import java.io.ByteArrayOutputStream; +import java.io.IOException; import java.io.ObjectInputStream; import java.io.ObjectOutputStream; +import java.io.Reader; +import java.io.StringReader; +import java.lang.reflect.Method; +import java.lang.reflect.Modifier; +import java.sql.ResultSet; +import java.sql.SQLException; import java.util.Arrays; +import java.util.Objects; -import org.junit.Assert; -import org.junit.Test; +import org.apache.commons.csv.CSVFormat.Builder; +import org.junit.jupiter.api.Test; /** - * - * - * @version $Id$ + * Tests {@link CSVFormat}. */ -public class CSVFormatTest { +class CSVFormatTest { + + public enum EmptyEnum { + // empty enum. 
+ } + + public enum Header { + Name, Email, Phone + } - private static void assertNotEquals(final Object right, final Object left) { - assertFalse(right.equals(left)); - assertFalse(left.equals(right)); + private static void assertNotEqualsFlip(final Object right, final Object left) { + assertNotEquals(right, left); + assertNotEquals(left, right); } private static CSVFormat copy(final CSVFormat format) { - return format.withDelimiter(format.getDelimiter()); + return format.builder().setDelimiter(format.getDelimiter()).get(); } - @Test(expected = IllegalArgumentException.class) - public void testDelimiterSameAsCommentStartThrowsException() { - CSVFormat.DEFAULT.withDelimiter('!').withCommentMarker('!'); + private void assertNotEqualsHash(final String name, final String type, final Object left, final Object right) { + if (left.equals(right) || right.equals(left)) { + fail("Objects must not compare equal for " + name + "(" + type + ")"); + } + if (left.hashCode() == right.hashCode()) { + fail("Hash code should not be equal for " + name + "(" + type + ")"); + } } - @Test(expected = IllegalArgumentException.class) - public void testDelimiterSameAsEscapeThrowsException() { - CSVFormat.DEFAULT.withDelimiter('!').withEscape('!'); + @Test + void testBuildVsGet() { + final Builder builder = CSVFormat.DEFAULT.builder(); + assertNotSame(builder.get(), builder.build()); } - @Test(expected = IllegalArgumentException.class) - public void testDuplicateHeaderElements() { - CSVFormat.DEFAULT.withHeader("A", "A"); + @Test + void testDelimiterCharLineBreakCrThrowsException1() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.builder().setDelimiter(Constants.CR).get()); } @Test - public void testEquals() { - final CSVFormat right = CSVFormat.DEFAULT; - final CSVFormat left = copy(right); + void testDelimiterCharLineBreakLfThrowsException1() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.builder().setDelimiter(Constants.LF).get()); + } + 
+ @Test + void testDelimiterEmptyStringThrowsException1() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.builder().setDelimiter("").get()); + } + + @SuppressWarnings("deprecation") + @Test + void testDelimiterSameAsCommentStartThrowsException_Deprecated() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.withDelimiter('!').withCommentMarker('!')); + } + + @Test + void testDelimiterSameAsCommentStartThrowsException1() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.builder().setDelimiter('!').setCommentMarker('!').get()); + } + + @SuppressWarnings("deprecation") + @Test + void testDelimiterSameAsEscapeThrowsException_Deprecated() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.withDelimiter('!').withEscape('!')); + } - assertFalse(right.equals(null)); - assertFalse(right.equals("A String Instance")); + @Test + void testDelimiterSameAsEscapeThrowsException1() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.builder().setDelimiter('!').setEscape('!').get()); + } + @Test + void testDelimiterSameAsRecordSeparatorThrowsException() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.newFormat(CR)); + } + + @Test + void testDelimiterStringLineBreakCrThrowsException1() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.builder().setDelimiter(String.valueOf(Constants.CR)).get()); + } + + @Test + void testDelimiterStringLineBreakLfThrowsException1() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.builder().setDelimiter(String.valueOf(Constants.LF)).get()); + } + + @Test + void testDuplicateHeaderElements() { + final String[] header = { "A", "A" }; + final CSVFormat format = CSVFormat.DEFAULT.builder().setHeader(header).get(); + assertEquals(2, format.getHeader().length); + assertArrayEquals(header, format.getHeader()); + } + + @SuppressWarnings("deprecation") + @Test + void 
testDuplicateHeaderElements_Deprecated() { + final String[] header = { "A", "A" }; + final CSVFormat format = CSVFormat.DEFAULT.withHeader(header); + assertEquals(2, format.getHeader().length); + assertArrayEquals(header, format.getHeader()); + } + + @Test + void testDuplicateHeaderElementsFalse() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.builder().setAllowDuplicateHeaderNames(false).setHeader("A", "A").get()); + } + + @SuppressWarnings("deprecation") + @Test + void testDuplicateHeaderElementsFalse_Deprecated() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.withAllowDuplicateHeaderNames(false).withHeader("A", "A")); + } + + @Test + void testDuplicateHeaderElementsTrue() { + CSVFormat.DEFAULT.builder().setAllowDuplicateHeaderNames(true).setHeader("A", "A").get(); + } + + @SuppressWarnings("deprecation") + @Test + void testDuplicateHeaderElementsTrue_Deprecated() { + CSVFormat.DEFAULT.withAllowDuplicateHeaderNames(true).withHeader("A", "A"); + } + + @Test + void testDuplicateHeaderElementsTrueContainsEmpty1() { + CSVFormat.DEFAULT.builder().setAllowDuplicateHeaderNames(false).setHeader("A", "", "B", "").get(); + } + + @Test + void testDuplicateHeaderElementsTrueContainsEmpty2() { + CSVFormat.DEFAULT.builder().setDuplicateHeaderMode(DuplicateHeaderMode.ALLOW_EMPTY).setHeader("A", "", "B", "").get(); + } + + @Test + void testDuplicateHeaderElementsTrueContainsEmpty3() { + CSVFormat.DEFAULT.builder().setAllowDuplicateHeaderNames(false).setAllowMissingColumnNames(true).setHeader("A", "", "B", "").get(); + } + + @Test + void testEquals() { + final CSVFormat right = CSVFormat.DEFAULT; + final CSVFormat left = copy(right); + assertNotEquals(null, right); + assertNotEquals("A String Instance", right); assertEquals(right, right); assertEquals(right, left); assertEquals(left, right); - assertEquals(right.hashCode(), right.hashCode()); assertEquals(right.hashCode(), left.hashCode()); } @Test - public void 
testEqualsCommentStart() { - final CSVFormat right = CSVFormat.newFormat('\'') - .withQuote('"') - .withCommentMarker('#') - .withQuoteMode(QuoteMode.ALL); - final CSVFormat left = right - .withCommentMarker('!'); + void testEqualsCommentStart() { + final CSVFormat right = CSVFormat.newFormat('\'').builder().setQuote('"').setCommentMarker('#').setQuoteMode(QuoteMode.ALL).get(); + final CSVFormat left = right.builder().setCommentMarker('!').get(); - assertNotEquals(right, left); + assertNotEqualsFlip(right, left); + } + + @SuppressWarnings("deprecation") + @Test + void testEqualsCommentStart_Deprecated() { + final CSVFormat right = CSVFormat.newFormat('\'').withQuote('"').withCommentMarker('#').withQuoteMode(QuoteMode.ALL); + final CSVFormat left = right.withCommentMarker('!'); + + assertNotEqualsFlip(right, left); } @Test - public void testEqualsDelimiter() { + void testEqualsDelimiter() { final CSVFormat right = CSVFormat.newFormat('!'); final CSVFormat left = CSVFormat.newFormat('?'); - assertNotEquals(right, left); + assertNotEqualsFlip(right, left); } @Test - public void testEqualsEscape() { - final CSVFormat right = CSVFormat.newFormat('\'') - .withQuote('"') - .withCommentMarker('#') - .withEscape('+') - .withQuoteMode(QuoteMode.ALL); - final CSVFormat left = right - .withEscape('!'); + void testEqualsEscape() { + final CSVFormat right = CSVFormat.newFormat('\'').builder().setQuote('"').setCommentMarker('#').setEscape('+').setQuoteMode(QuoteMode.ALL).get(); + final CSVFormat left = right.builder().setEscape('!').get(); - assertNotEquals(right, left); + assertNotEqualsFlip(right, left); } + @SuppressWarnings("deprecation") @Test - public void testEqualsHeader() { - final CSVFormat right = CSVFormat.newFormat('\'') - .withRecordSeparator(CR) - .withCommentMarker('#') - .withEscape('+') - .withHeader("One", "Two", "Three") - .withIgnoreEmptyLines() - .withIgnoreSurroundingSpaces() - .withQuote('"') - .withQuoteMode(QuoteMode.ALL); - final CSVFormat left = right 
- .withHeader("Three", "Two", "One"); + void testEqualsEscape_Deprecated() { + final CSVFormat right = CSVFormat.newFormat('\'').withQuote('"').withCommentMarker('#').withEscape('+').withQuoteMode(QuoteMode.ALL); + final CSVFormat left = right.withEscape('!'); - assertNotEquals(right, left); + assertNotEqualsFlip(right, left); } @Test - public void testEqualsIgnoreEmptyLines() { - final CSVFormat right = CSVFormat.newFormat('\'') - .withCommentMarker('#') - .withEscape('+') - .withIgnoreEmptyLines() - .withIgnoreSurroundingSpaces() - .withQuote('"') - .withQuoteMode(QuoteMode.ALL); - final CSVFormat left = right - .withIgnoreEmptyLines(false); + void testEqualsHash() throws Exception { + final Method[] methods = CSVFormat.class.getDeclaredMethods(); + for (final Method method : methods) { + if (Modifier.isPublic(method.getModifiers())) { + final String name = method.getName(); + if (name.startsWith("with")) { + for (final Class<?> cls : method.getParameterTypes()) { + final String type = cls.getCanonicalName(); + switch (type) { + case "boolean": { + final Object defTrue = method.invoke(CSVFormat.DEFAULT, Boolean.TRUE); + final Object defFalse = method.invoke(CSVFormat.DEFAULT, Boolean.FALSE); + assertNotEqualsHash(name, type, defTrue, defFalse); + break; + } + case "char": { + final Object a = method.invoke(CSVFormat.DEFAULT, 'a'); + final Object b = method.invoke(CSVFormat.DEFAULT, 'b'); + assertNotEqualsHash(name, type, a, b); + break; + } + case "java.lang.Character": { + final Object a = method.invoke(CSVFormat.DEFAULT, new Object[] { null }); + final Object b = method.invoke(CSVFormat.DEFAULT, Character.valueOf('d')); + assertNotEqualsHash(name, type, a, b); + break; + } + case "java.lang.String": { + final Object a = method.invoke(CSVFormat.DEFAULT, new Object[] { null }); + final Object b = method.invoke(CSVFormat.DEFAULT, "e"); + assertNotEqualsHash(name, type, a, b); + break; + } + case "java.lang.String[]": { + final Object a = 
method.invoke(CSVFormat.DEFAULT, new Object[] { new String[] { null, null } }); + final Object b = method.invoke(CSVFormat.DEFAULT, new Object[] { new String[] { "f", "g" } }); + assertNotEqualsHash(name, type, a, b); + break; + } + case "org.apache.commons.csv.QuoteMode": { + final Object a = method.invoke(CSVFormat.DEFAULT, QuoteMode.MINIMAL); + final Object b = method.invoke(CSVFormat.DEFAULT, QuoteMode.ALL); + assertNotEqualsHash(name, type, a, b); + break; + } + case "org.apache.commons.csv.DuplicateHeaderMode": { + final Object a = method.invoke(CSVFormat.DEFAULT, DuplicateHeaderMode.ALLOW_ALL); + final Object b = method.invoke(CSVFormat.DEFAULT, DuplicateHeaderMode.DISALLOW); + assertNotEqualsHash(name, type, a, b); + break; + } + case "java.lang.Object[]": { + final Object a = method.invoke(CSVFormat.DEFAULT, new Object[] { new Object[] { null, null } }); + final Object b = method.invoke(CSVFormat.DEFAULT, new Object[] { new Object[] { new Object(), new Object() } }); + assertNotEqualsHash(name, type, a, b); + break; + } + default: + if ("withHeader".equals(name)) { // covered above by String[] + // ignored + } else { + fail("Unhandled method: " + name + "(" + type + ")"); + } + break; + } + } + } + } + } + } - assertNotEquals(right, left); + @Test + void testEqualsHeader() { + final CSVFormat right = CSVFormat.newFormat('\'').builder().setRecordSeparator(CR).setCommentMarker('#').setEscape('+').setHeader("One", "Two", "Three") + .setIgnoreEmptyLines(true).setIgnoreSurroundingSpaces(true).setQuote('"').setQuoteMode(QuoteMode.ALL).get(); + final CSVFormat left = right.builder().setHeader("Three", "Two", "One").get(); + + assertNotEqualsFlip(right, left); } + @SuppressWarnings("deprecation") @Test - public void testEqualsIgnoreSurroundingSpaces() { - final CSVFormat right = CSVFormat.newFormat('\'') - .withCommentMarker('#') - .withEscape('+') - .withIgnoreSurroundingSpaces() - .withQuote('"') + void testEqualsHeader_Deprecated() { + final CSVFormat right = 
CSVFormat.newFormat('\'').withRecordSeparator(CR).withCommentMarker('#').withEscape('+').withHeader("One", "Two", "Three") + .withIgnoreEmptyLines().withIgnoreSurroundingSpaces().withQuote('"').withQuoteMode(QuoteMode.ALL); + final CSVFormat left = right.withHeader("Three", "Two", "One"); + + assertNotEqualsFlip(right, left); + } + + @Test + void testEqualsIgnoreEmptyLines() { + final CSVFormat right = CSVFormat.newFormat('\'').builder().setCommentMarker('#').setEscape('+').setIgnoreEmptyLines(true) + .setIgnoreSurroundingSpaces(true).setQuote('"').setQuoteMode(QuoteMode.ALL).get(); + final CSVFormat left = right.builder().setIgnoreEmptyLines(false).get(); + + assertNotEqualsFlip(right, left); + } + + @SuppressWarnings("deprecation") + @Test + void testEqualsIgnoreEmptyLines_Deprecated() { + final CSVFormat right = CSVFormat.newFormat('\'').withCommentMarker('#').withEscape('+').withIgnoreEmptyLines().withIgnoreSurroundingSpaces() + .withQuote('"').withQuoteMode(QuoteMode.ALL); + final CSVFormat left = right.withIgnoreEmptyLines(false); + + assertNotEqualsFlip(right, left); + } + + @Test + void testEqualsIgnoreSurroundingSpaces() { + final CSVFormat right = CSVFormat.newFormat('\'').builder().setCommentMarker('#').setEscape('+').setIgnoreSurroundingSpaces(true).setQuote('"') + .setQuoteMode(QuoteMode.ALL).get(); + final CSVFormat left = right.builder().setIgnoreSurroundingSpaces(false).get(); + + assertNotEqualsFlip(right, left); + } + + @SuppressWarnings("deprecation") + @Test + void testEqualsIgnoreSurroundingSpaces_Deprecated() { + final CSVFormat right = CSVFormat.newFormat('\'').withCommentMarker('#').withEscape('+').withIgnoreSurroundingSpaces().withQuote('"') .withQuoteMode(QuoteMode.ALL); - final CSVFormat left = right - .withIgnoreSurroundingSpaces(false); + final CSVFormat left = right.withIgnoreSurroundingSpaces(false); - assertNotEquals(right, left); + assertNotEqualsFlip(right, left); + } + + @Test + void testEqualsLeftNoQuoteRightQuote() { + final 
CSVFormat left = CSVFormat.newFormat(',').builder().setQuote(null).get(); + final CSVFormat right = left.builder().setQuote('#').get(); + + assertNotEqualsFlip(left, right); + } + + @SuppressWarnings("deprecation") + @Test + void testEqualsLeftNoQuoteRightQuote_Deprecated() { + final CSVFormat left = CSVFormat.newFormat(',').withQuote(null); + final CSVFormat right = left.withQuote('#'); + + assertNotEqualsFlip(left, right); + } + + @Test + void testEqualsMaxRows() { + final CSVFormat right = CSVFormat.DEFAULT.builder().setMaxRows(10).get(); + final CSVFormat left = CSVFormat.DEFAULT.builder().setMaxRows(1000).get(); + assertNotEqualsFlip(right, left); + assertNotEquals(right.hashCode(), left.hashCode()); + } + + @Test + void testEqualsNoQuotes() { + final CSVFormat left = CSVFormat.newFormat(',').builder().setQuote(null).get(); + final CSVFormat right = left.builder().setQuote(null).get(); + + assertEquals(left, right); + } + + @SuppressWarnings("deprecation") + @Test + void testEqualsNoQuotes_Deprecated() { + final CSVFormat left = CSVFormat.newFormat(',').withQuote(null); + final CSVFormat right = left.withQuote(null); + + assertEquals(left, right); + } + + @Test + void testEqualsNullString() { + final CSVFormat right = CSVFormat.newFormat('\'').builder().setRecordSeparator(CR).setCommentMarker('#').setEscape('+').setIgnoreEmptyLines(true) + .setIgnoreSurroundingSpaces(true).setQuote('"').setQuoteMode(QuoteMode.ALL).setNullString("null").get(); + final CSVFormat left = right.builder().setNullString("---").get(); + + assertNotEqualsFlip(right, left); + } + + @SuppressWarnings("deprecation") + @Test + void testEqualsNullString_Deprecated() { + final CSVFormat right = CSVFormat.newFormat('\'').withRecordSeparator(CR).withCommentMarker('#').withEscape('+').withIgnoreEmptyLines() + .withIgnoreSurroundingSpaces().withQuote('"').withQuoteMode(QuoteMode.ALL).withNullString("null"); + final CSVFormat left = right.withNullString("---"); + + assertNotEqualsFlip(right, 
left); + } + + @Test + void testEqualsOne() { + + final CSVFormat csvFormatOne = CSVFormat.INFORMIX_UNLOAD; + final CSVFormat csvFormatTwo = CSVFormat.MYSQL; + + assertEquals('\\', (char) csvFormatOne.getEscapeCharacter()); + assertEquals('\\', csvFormatOne.getEscapeChar()); + assertNull(csvFormatOne.getQuoteMode()); + + assertTrue(csvFormatOne.getIgnoreEmptyLines()); + assertFalse(csvFormatOne.getSkipHeaderRecord()); + + assertFalse(csvFormatOne.getIgnoreHeaderCase()); + assertNull(csvFormatOne.getCommentMarker()); + + assertFalse(csvFormatOne.isCommentMarkerSet()); + assertTrue(csvFormatOne.isQuoteCharacterSet()); + + assertEquals('|', csvFormatOne.getDelimiter()); + assertFalse(csvFormatOne.getAllowMissingColumnNames()); + + assertTrue(csvFormatOne.isEscapeCharacterSet()); + assertEquals("\n", csvFormatOne.getRecordSeparator()); + + assertEquals('\"', (char) csvFormatOne.getQuoteCharacter()); + assertFalse(csvFormatOne.getTrailingDelimiter()); + + assertFalse(csvFormatOne.getTrim()); + assertFalse(csvFormatOne.isNullStringSet()); + + assertNull(csvFormatOne.getNullString()); + assertFalse(csvFormatOne.getIgnoreSurroundingSpaces()); + + assertTrue(csvFormatTwo.isEscapeCharacterSet()); + assertNull(csvFormatTwo.getQuoteCharacter()); + + assertFalse(csvFormatTwo.getAllowMissingColumnNames()); + assertEquals(QuoteMode.ALL_NON_NULL, csvFormatTwo.getQuoteMode()); + + assertEquals('\t', csvFormatTwo.getDelimiter()); + assertArrayEquals(new char[] { '\t' }, csvFormatTwo.getDelimiterCharArray()); + assertEquals("\t", csvFormatTwo.getDelimiterString()); + assertEquals("\n", csvFormatTwo.getRecordSeparator()); + + assertFalse(csvFormatTwo.isQuoteCharacterSet()); + assertTrue(csvFormatTwo.isNullStringSet()); + + assertEquals('\\', (char) csvFormatTwo.getEscapeCharacter()); + assertFalse(csvFormatTwo.getIgnoreHeaderCase()); + + assertFalse(csvFormatTwo.getTrim()); + assertFalse(csvFormatTwo.getIgnoreEmptyLines()); + + assertEquals("\\N", csvFormatTwo.getNullString()); + 
assertFalse(csvFormatTwo.getIgnoreSurroundingSpaces()); + + assertFalse(csvFormatTwo.getTrailingDelimiter()); + assertFalse(csvFormatTwo.getSkipHeaderRecord()); + + assertNull(csvFormatTwo.getCommentMarker()); + assertFalse(csvFormatTwo.isCommentMarkerSet()); + + assertNotSame(csvFormatTwo, csvFormatOne); + assertNotEquals(csvFormatTwo, csvFormatOne); + + assertEquals('\\', (char) csvFormatOne.getEscapeCharacter()); + assertNull(csvFormatOne.getQuoteMode()); + + assertTrue(csvFormatOne.getIgnoreEmptyLines()); + assertFalse(csvFormatOne.getSkipHeaderRecord()); + + assertFalse(csvFormatOne.getIgnoreHeaderCase()); + assertNull(csvFormatOne.getCommentMarker()); + + assertFalse(csvFormatOne.isCommentMarkerSet()); + assertTrue(csvFormatOne.isQuoteCharacterSet()); + + assertEquals('|', csvFormatOne.getDelimiter()); + assertFalse(csvFormatOne.getAllowMissingColumnNames()); + + assertTrue(csvFormatOne.isEscapeCharacterSet()); + assertEquals("\n", csvFormatOne.getRecordSeparator()); + + assertEquals('\"', (char) csvFormatOne.getQuoteCharacter()); + assertFalse(csvFormatOne.getTrailingDelimiter()); + + assertFalse(csvFormatOne.getTrim()); + assertFalse(csvFormatOne.isNullStringSet()); + + assertNull(csvFormatOne.getNullString()); + assertFalse(csvFormatOne.getIgnoreSurroundingSpaces()); + + assertTrue(csvFormatTwo.isEscapeCharacterSet()); + assertNull(csvFormatTwo.getQuoteCharacter()); + + assertFalse(csvFormatTwo.getAllowMissingColumnNames()); + assertEquals(QuoteMode.ALL_NON_NULL, csvFormatTwo.getQuoteMode()); + + assertEquals('\t', csvFormatTwo.getDelimiter()); + assertEquals("\n", csvFormatTwo.getRecordSeparator()); + + assertFalse(csvFormatTwo.isQuoteCharacterSet()); + assertTrue(csvFormatTwo.isNullStringSet()); + + assertEquals('\\', (char) csvFormatTwo.getEscapeCharacter()); + assertFalse(csvFormatTwo.getIgnoreHeaderCase()); + + assertFalse(csvFormatTwo.getTrim()); + assertFalse(csvFormatTwo.getIgnoreEmptyLines()); + + assertEquals("\\N", csvFormatTwo.getNullString()); 
+ assertFalse(csvFormatTwo.getIgnoreSurroundingSpaces()); + + assertFalse(csvFormatTwo.getTrailingDelimiter()); + assertFalse(csvFormatTwo.getSkipHeaderRecord()); + + assertNull(csvFormatTwo.getCommentMarker()); + assertFalse(csvFormatTwo.isCommentMarkerSet()); + + assertNotSame(csvFormatOne, csvFormatTwo); + assertNotSame(csvFormatTwo, csvFormatOne); + + assertNotEquals(csvFormatOne, csvFormatTwo); + assertNotEquals(csvFormatTwo, csvFormatOne); + + assertNotEquals(csvFormatTwo, csvFormatOne); + + } + + @Test + void testEqualsQuoteChar() { + final CSVFormat right = CSVFormat.newFormat('\'').builder().setQuote('"').get(); + final CSVFormat left = right.builder().setQuote('!').get(); + + assertNotEqualsFlip(right, left); } + @SuppressWarnings("deprecation") @Test - public void testEqualsQuoteChar() { + void testEqualsQuoteChar_Deprecated() { final CSVFormat right = CSVFormat.newFormat('\'').withQuote('"'); final CSVFormat left = right.withQuote('!'); - assertNotEquals(right, left); + assertNotEqualsFlip(right, left); } @Test - public void testEqualsLeftNoQuoteRightQuote() { - final CSVFormat left = CSVFormat.newFormat(',').withQuote(null); - final CSVFormat right = left.withQuote('#'); - - assertNotEquals(left, right); + void testEqualsQuotePolicy() { + final CSVFormat right = CSVFormat.newFormat('\'').builder().setQuote('"').setQuoteMode(QuoteMode.ALL).get(); + final CSVFormat left = right.builder().setQuoteMode(QuoteMode.MINIMAL).get(); + + assertNotEqualsFlip(right, left); } + @SuppressWarnings("deprecation") @Test - public void testEqualsNoQuotes() { - final CSVFormat left = CSVFormat.newFormat(',').withQuote(null); - final CSVFormat right = left.withQuote(null); + void testEqualsQuotePolicy_Deprecated() { + final CSVFormat right = CSVFormat.newFormat('\'').withQuote('"').withQuoteMode(QuoteMode.ALL); + final CSVFormat left = right.withQuoteMode(QuoteMode.MINIMAL); - assertEquals(left, right); + assertNotEqualsFlip(right, left); } @Test - public void 
testEqualsQuotePolicy() { - final CSVFormat right = CSVFormat.newFormat('\'') - .withQuote('"') - .withQuoteMode(QuoteMode.ALL); - final CSVFormat left = right - .withQuoteMode(QuoteMode.MINIMAL); + void testEqualsRecordSeparator() { + final CSVFormat right = CSVFormat.newFormat('\'').builder().setRecordSeparator(CR).setCommentMarker('#').setEscape('+').setIgnoreEmptyLines(true) + .setIgnoreSurroundingSpaces(true).setQuote('"').setQuoteMode(QuoteMode.ALL).get(); + final CSVFormat left = right.builder().setRecordSeparator(LF).get(); - assertNotEquals(right, left); + assertNotEqualsFlip(right, left); } + @SuppressWarnings("deprecation") @Test - public void testEqualsRecordSeparator() { - final CSVFormat right = CSVFormat.newFormat('\'') - .withRecordSeparator(CR) - .withCommentMarker('#') - .withEscape('+') - .withIgnoreEmptyLines() - .withIgnoreSurroundingSpaces() - .withQuote('"') - .withQuoteMode(QuoteMode.ALL); - final CSVFormat left = right - .withRecordSeparator(LF); + void testEqualsRecordSeparator_Deprecated() { + final CSVFormat right = CSVFormat.newFormat('\'').withRecordSeparator(CR).withCommentMarker('#').withEscape('+').withIgnoreEmptyLines() + .withIgnoreSurroundingSpaces().withQuote('"').withQuoteMode(QuoteMode.ALL); + final CSVFormat left = right.withRecordSeparator(LF); - assertNotEquals(right, left); + assertNotEqualsFlip(right, left); } + void testEqualsSkipHeaderRecord() { + final CSVFormat right = CSVFormat.newFormat('\'').builder().setRecordSeparator(CR).setCommentMarker('#').setEscape('+').setIgnoreEmptyLines(true) + .setIgnoreSurroundingSpaces(true).setQuote('"').setQuoteMode(QuoteMode.ALL).setNullString("null").setSkipHeaderRecord(true).get(); + final CSVFormat left = right.builder().setSkipHeaderRecord(false).get(); + + assertNotEqualsFlip(right, left); + } + + @SuppressWarnings("deprecation") @Test - public void testEqualsNullString() { - final CSVFormat right = CSVFormat.newFormat('\'') - .withRecordSeparator(CR) - .withCommentMarker('#') 
- .withEscape('+') - .withIgnoreEmptyLines() - .withIgnoreSurroundingSpaces() - .withQuote('"') - .withQuoteMode(QuoteMode.ALL) - .withNullString("null"); - final CSVFormat left = right - .withNullString("---"); + void testEqualsSkipHeaderRecord_Deprecated() { + final CSVFormat right = CSVFormat.newFormat('\'').withRecordSeparator(CR).withCommentMarker('#').withEscape('+').withIgnoreEmptyLines() + .withIgnoreSurroundingSpaces().withQuote('"').withQuoteMode(QuoteMode.ALL).withNullString("null").withSkipHeaderRecord(); + final CSVFormat left = right.withSkipHeaderRecord(false); - assertNotEquals(right, left); + assertNotEqualsFlip(right, left); } @Test - public void testEqualsSkipHeaderRecord() { - final CSVFormat right = CSVFormat.newFormat('\'') - .withRecordSeparator(CR) - .withCommentMarker('#') - .withEscape('+') - .withIgnoreEmptyLines() - .withIgnoreSurroundingSpaces() - .withQuote('"') - .withQuoteMode(QuoteMode.ALL) - .withNullString("null") - .withSkipHeaderRecord(); - final CSVFormat left = right - .withSkipHeaderRecord(false); + void testEqualsWithNull() { - assertNotEquals(right, left); + final CSVFormat csvFormat = CSVFormat.POSTGRESQL_TEXT; + + assertEquals('\\', (char) csvFormat.getEscapeCharacter()); + assertFalse(csvFormat.getIgnoreSurroundingSpaces()); + + assertFalse(csvFormat.getTrailingDelimiter()); + assertFalse(csvFormat.getTrim()); + + assertFalse(csvFormat.isQuoteCharacterSet()); + assertEquals("\\N", csvFormat.getNullString()); + + assertFalse(csvFormat.getIgnoreHeaderCase()); + assertTrue(csvFormat.isEscapeCharacterSet()); + + assertFalse(csvFormat.isCommentMarkerSet()); + assertNull(csvFormat.getCommentMarker()); + + assertFalse(csvFormat.getAllowMissingColumnNames()); + assertEquals(QuoteMode.ALL_NON_NULL, csvFormat.getQuoteMode()); + + assertEquals('\t', csvFormat.getDelimiter()); + assertFalse(csvFormat.getSkipHeaderRecord()); + + assertEquals("\n", csvFormat.getRecordSeparator()); + assertFalse(csvFormat.getIgnoreEmptyLines()); + + 
assertNull(csvFormat.getQuoteCharacter()); + assertTrue(csvFormat.isNullStringSet()); + + assertEquals('\\', (char) csvFormat.getEscapeCharacter()); + assertFalse(csvFormat.getIgnoreSurroundingSpaces()); + + assertFalse(csvFormat.getTrailingDelimiter()); + assertFalse(csvFormat.getTrim()); + + assertFalse(csvFormat.isQuoteCharacterSet()); + assertEquals("\\N", csvFormat.getNullString()); + + assertFalse(csvFormat.getIgnoreHeaderCase()); + assertTrue(csvFormat.isEscapeCharacterSet()); + + assertFalse(csvFormat.isCommentMarkerSet()); + assertNull(csvFormat.getCommentMarker()); + + assertFalse(csvFormat.getAllowMissingColumnNames()); + assertEquals(QuoteMode.ALL_NON_NULL, csvFormat.getQuoteMode()); + + assertEquals('\t', csvFormat.getDelimiter()); + assertFalse(csvFormat.getSkipHeaderRecord()); + + assertEquals("\n", csvFormat.getRecordSeparator()); + assertFalse(csvFormat.getIgnoreEmptyLines()); + + assertNull(csvFormat.getQuoteCharacter()); + assertTrue(csvFormat.isNullStringSet()); + + assertNotEquals(null, csvFormat); + + } + + @Test + void testEscapeSameAsCommentStartThrowsException() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.builder().setEscape('!').setCommentMarker('!').get()); + } + + @SuppressWarnings("deprecation") + @Test + void testEscapeSameAsCommentStartThrowsException_Deprecated() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.withEscape('!').withCommentMarker('!')); } - @Test(expected = IllegalArgumentException.class) - public void testEscapeSameAsCommentStartThrowsException() { - CSVFormat.DEFAULT.withEscape('!').withCommentMarker('!'); + @Test + void testEscapeSameAsCommentStartThrowsExceptionForWrapperType() { + // Cannot assume that callers won't use different Character objects + assertThrows(IllegalArgumentException.class, + () -> CSVFormat.DEFAULT.builder().setEscape(Character.valueOf('!')).setCommentMarker(Character.valueOf('!')).get()); } - @Test(expected = IllegalArgumentException.class) 
- public void testEscapeSameAsCommentStartThrowsExceptionForWrapperType() { + @SuppressWarnings("deprecation") + @Test + void testEscapeSameAsCommentStartThrowsExceptionForWrapperType_Deprecated() { // Cannot assume that callers won't use different Character objects - CSVFormat.DEFAULT.withEscape(new Character('!')).withCommentMarker(new Character('!')); + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.withEscape(Character.valueOf('!')).withCommentMarker(Character.valueOf('!'))); } @Test - public void testFormat() { + void testFormat() { final CSVFormat format = CSVFormat.DEFAULT; assertEquals("", format.format()); @@ -269,9 +737,57 @@ public void testFormat() { assertEquals("\"x,y\",z", format.format("x,y", "z")); } + @Test // I assume this to be a defect. + void testFormatThrowsNullPointerException() { + + final CSVFormat csvFormat = CSVFormat.MYSQL; + + final NullPointerException e = assertThrows(NullPointerException.class, () -> csvFormat.format((Object[]) null)); + assertEquals(Objects.class.getName(), e.getStackTrace()[0].getClassName()); + } + + @Test + void testFormatToString() { + // @formatter:off + final CSVFormat format = CSVFormat.RFC4180 + .withEscape('?') + .withDelimiter(',') + .withQuoteMode(QuoteMode.MINIMAL) + .withRecordSeparator(CRLF) + .withQuote('"') + .withNullString("") + .withIgnoreHeaderCase(true) + .withHeaderComments("This is HeaderComments") + .withHeader("col1", "col2", "col3"); + // @formatter:on + assertEquals( + "Delimiter=<,> Escape=<?> QuoteChar=<\"> QuoteMode=<MINIMAL> NullString=<> RecordSeparator=<" + CRLF + + "> IgnoreHeaderCase:ignored SkipHeaderRecord:false HeaderComments:[This is HeaderComments] Header:[col1, col2, col3]", + format.toString()); + } + + @Test + void testGetAllowDuplicateHeaderNames() { + final Builder builder = CSVFormat.DEFAULT.builder(); + assertTrue(builder.get().getAllowDuplicateHeaderNames()); + 
assertTrue(builder.setDuplicateHeaderMode(DuplicateHeaderMode.ALLOW_ALL).get().getAllowDuplicateHeaderNames()); + assertFalse(builder.setDuplicateHeaderMode(DuplicateHeaderMode.ALLOW_EMPTY).get().getAllowDuplicateHeaderNames()); + assertFalse(builder.setDuplicateHeaderMode(DuplicateHeaderMode.DISALLOW).get().getAllowDuplicateHeaderNames()); + } + @Test - public void testGetHeader() throws Exception { - final String[] header = new String[]{"one", "two", "three"}; + void testGetDuplicateHeaderMode() { + final Builder builder = CSVFormat.DEFAULT.builder(); + + assertEquals(DuplicateHeaderMode.ALLOW_ALL, builder.get().getDuplicateHeaderMode()); + assertEquals(DuplicateHeaderMode.ALLOW_ALL, builder.setDuplicateHeaderMode(DuplicateHeaderMode.ALLOW_ALL).get().getDuplicateHeaderMode()); + assertEquals(DuplicateHeaderMode.ALLOW_EMPTY, builder.setDuplicateHeaderMode(DuplicateHeaderMode.ALLOW_EMPTY).get().getDuplicateHeaderMode()); + assertEquals(DuplicateHeaderMode.DISALLOW, builder.setDuplicateHeaderMode(DuplicateHeaderMode.DISALLOW).get().getDuplicateHeaderMode()); + } + + @Test + void testGetHeader() { + final String[] header = { "one", "two", "three" }; final CSVFormat formatWithHeader = CSVFormat.DEFAULT.withHeader(header); // getHeader() makes a copy of the header array. 
final String[] headerCopy = formatWithHeader.getHeader(); @@ -283,51 +799,297 @@ public void testGetHeader() throws Exception { } @Test - public void testNullRecordSeparatorCsv106() { + void testHashCodeAndWithIgnoreHeaderCase() { + + final CSVFormat csvFormat = CSVFormat.INFORMIX_UNLOAD_CSV; + final CSVFormat csvFormatTwo = csvFormat.withIgnoreHeaderCase(); + csvFormatTwo.hashCode(); + + assertFalse(csvFormat.getIgnoreHeaderCase()); + assertTrue(csvFormatTwo.getIgnoreHeaderCase()); // now different + assertFalse(csvFormatTwo.getTrailingDelimiter()); + + assertNotEquals(csvFormatTwo, csvFormat); // CSV-244 - should not be equal + assertFalse(csvFormatTwo.getAllowMissingColumnNames()); + + assertFalse(csvFormatTwo.getTrim()); + + } + + @Test + void testJiraCsv236() { + CSVFormat.DEFAULT.builder().setAllowDuplicateHeaderNames(true).setHeader("CC", "VV", "VV").get(); + } + + @SuppressWarnings("deprecation") + @Test + void testJiraCsv236__Deprecated() { + CSVFormat.DEFAULT.withAllowDuplicateHeaderNames().withHeader("CC", "VV", "VV"); + } + + @Test + void testNewFormat() { + + final CSVFormat csvFormat = CSVFormat.newFormat('X'); + + assertFalse(csvFormat.getSkipHeaderRecord()); + assertFalse(csvFormat.isEscapeCharacterSet()); + + assertNull(csvFormat.getRecordSeparator()); + assertNull(csvFormat.getQuoteMode()); + + assertNull(csvFormat.getCommentMarker()); + assertFalse(csvFormat.getIgnoreHeaderCase()); + + assertFalse(csvFormat.getAllowMissingColumnNames()); + assertFalse(csvFormat.getTrim()); + + assertFalse(csvFormat.isNullStringSet()); + assertNull(csvFormat.getEscapeCharacter()); + + assertFalse(csvFormat.getIgnoreSurroundingSpaces()); + assertFalse(csvFormat.getTrailingDelimiter()); + + assertEquals('X', csvFormat.getDelimiter()); + assertNull(csvFormat.getNullString()); + + assertFalse(csvFormat.isQuoteCharacterSet()); + assertFalse(csvFormat.isCommentMarkerSet()); + + assertNull(csvFormat.getQuoteCharacter()); + assertFalse(csvFormat.getIgnoreEmptyLines()); + 
+ assertFalse(csvFormat.getSkipHeaderRecord()); + assertFalse(csvFormat.isEscapeCharacterSet()); + + assertNull(csvFormat.getRecordSeparator()); + assertNull(csvFormat.getQuoteMode()); + + assertNull(csvFormat.getCommentMarker()); + assertFalse(csvFormat.getIgnoreHeaderCase()); + + assertFalse(csvFormat.getAllowMissingColumnNames()); + assertFalse(csvFormat.getTrim()); + + assertFalse(csvFormat.isNullStringSet()); + assertNull(csvFormat.getEscapeCharacter()); + + assertFalse(csvFormat.getIgnoreSurroundingSpaces()); + assertFalse(csvFormat.getTrailingDelimiter()); + + assertEquals('X', csvFormat.getDelimiter()); + assertNull(csvFormat.getNullString()); + + assertFalse(csvFormat.isQuoteCharacterSet()); + assertFalse(csvFormat.isCommentMarkerSet()); + + assertNull(csvFormat.getQuoteCharacter()); + assertFalse(csvFormat.getIgnoreEmptyLines()); + + } + + @Test + void testNullRecordSeparatorCsv106() { + final CSVFormat format = CSVFormat.newFormat(';').builder().setSkipHeaderRecord(true).setHeader("H1", "H2").get(); + final String formatStr = format.format("A", "B"); + assertNotNull(formatStr); + assertFalse(formatStr.endsWith("null")); + } + + @SuppressWarnings("deprecation") + @Test + void testNullRecordSeparatorCsv106__Deprecated() { final CSVFormat format = CSVFormat.newFormat(';').withSkipHeaderRecord().withHeader("H1", "H2"); final String formatStr = format.format("A", "B"); assertNotNull(formatStr); assertFalse(formatStr.endsWith("null")); } - @Test(expected = IllegalArgumentException.class) - public void testQuoteCharSameAsCommentStartThrowsException() { - CSVFormat.DEFAULT.withQuote('!').withCommentMarker('!'); + @Test + void testPrintRecord() throws IOException { + final Appendable out = new StringBuilder(); + final CSVFormat format = CSVFormat.RFC4180; + format.printRecord(out, "a", "b", "c"); + assertEquals("a,b,c" + format.getRecordSeparator(), out.toString()); + } + + @Test + void testPrintRecordEmpty() throws IOException { + final Appendable out = new 
StringBuilder(); + final CSVFormat format = CSVFormat.RFC4180; + format.printRecord(out); + assertEquals(format.getRecordSeparator(), out.toString()); + } + + @Test + void testPrintWithEscapesEndWithCRLF() throws IOException { + final Reader in = new StringReader("x,y,x\r\na,?b,c\r\n"); + final Appendable out = new StringBuilder(); + final CSVFormat format = CSVFormat.RFC4180.withEscape('?').withDelimiter(',').withQuote(null).withRecordSeparator(CRLF); + format.print(in, out, true); + assertEquals("x?,y?,x?r?na?,??b?,c?r?n", out.toString()); + } + + @Test + void testPrintWithEscapesEndWithoutCRLF() throws IOException { + final Reader in = new StringReader("x,y,x"); + final Appendable out = new StringBuilder(); + final CSVFormat format = CSVFormat.RFC4180.withEscape('?').withDelimiter(',').withQuote(null).withRecordSeparator(CRLF); + format.print(in, out, true); + assertEquals("x?,y?,x", out.toString()); + } + + @Test + void testPrintWithoutQuotes() throws IOException { + final Reader in = new StringReader(""); + final Appendable out = new StringBuilder(); + final CSVFormat format = CSVFormat.RFC4180.withDelimiter(',').withQuote('"').withEscape('?').withQuoteMode(QuoteMode.NON_NUMERIC); + format.print(in, out, true); + assertEquals("\"\"", out.toString()); + } + + @Test + void testPrintWithQuoteModeIsNONE() throws IOException { + final Reader in = new StringReader("a,b,c"); + final Appendable out = new StringBuilder(); + final CSVFormat format = CSVFormat.RFC4180.withDelimiter(',').withQuote('"').withEscape('?').withQuoteMode(QuoteMode.NONE); + format.print(in, out, true); + assertEquals("a?,b?,c", out.toString()); + } + + @Test + void testPrintWithQuotes() throws IOException { + final Reader in = new StringReader("\"a,b,c\r\nx,y,z"); + final Appendable out = new StringBuilder(); + final CSVFormat format = CSVFormat.RFC4180.withDelimiter(',').withQuote('"').withEscape('?').withQuoteMode(QuoteMode.NON_NUMERIC); + format.print(in, out, true); + 
assertEquals("\"\"\"a,b,c\r\nx,y,z\"", out.toString()); + } + + /** + * Tests CSV-326. + */ + @Test + void testPrintWithQuotesEscapeBeforeQuote() throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.builder() + .setEscape('\\') + .setQuote('"') + .get(); + final String value = "\\\""; + final Appendable out = new StringBuilder(); + format.print(new StringReader(value), out, true); + try (CSVParser parser = CSVParser.parse(out.toString(), format)) { + assertEquals(value, parser.getRecords().get(0).get(0)); + } } - @Test(expected = IllegalArgumentException.class) - public void testQuoteCharSameAsCommentStartThrowsExceptionForWrapperType() { + @Test + void testQuoteCharSameAsCommentStartThrowsException() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.builder().setQuote('!').setCommentMarker('!').get()); + } + + @SuppressWarnings("deprecation") + @Test + void testQuoteCharSameAsCommentStartThrowsException_Deprecated() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.withQuote('!').withCommentMarker('!')); + } + + @Test + void testQuoteCharSameAsCommentStartThrowsExceptionForWrapperType() { + // Cannot assume that callers won't use different Character objects + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.builder().setQuote(Character.valueOf('!')).setCommentMarker('!').get()); + } + + @SuppressWarnings("deprecation") + @Test + void testQuoteCharSameAsCommentStartThrowsExceptionForWrapperType_Deprecated() { // Cannot assume that callers won't use different Character objects - CSVFormat.DEFAULT.withQuote(new Character('!')).withCommentMarker('!'); + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.withQuote(Character.valueOf('!')).withCommentMarker('!')); + } + + @Test + void testQuoteCharSameAsDelimiterThrowsException() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.builder().setQuote('!').setDelimiter('!').get()); } - @Test(expected = 
IllegalArgumentException.class) - public void testQuoteCharSameAsDelimiterThrowsException() { - CSVFormat.DEFAULT.withQuote('!').withDelimiter('!'); + @SuppressWarnings("deprecation") + @Test + void testQuoteCharSameAsDelimiterThrowsException_Deprecated() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.withQuote('!').withDelimiter('!')); } - @Test(expected = IllegalArgumentException.class) - public void testQuotePolicyNoneWithoutEscapeThrowsException() { - CSVFormat.newFormat('!').withQuoteMode(QuoteMode.NONE); + @Test + void testQuotedNullStringTracksQuoteCharacter() throws IOException { + final StringBuilder out = new StringBuilder(); + // @formatter:off + final Builder builder = CSVFormat.DEFAULT.builder(); + final CSVFormat format = builder + .setQuoteMode(QuoteMode.ALL) + .setNullString("NULL") + .get(); + // @formatter:on + format.print(null, out, true); + assertEquals("\"NULL\"", out.toString()); + // set + out.setLength(0); + builder.setQuote('\''); + builder.get().print(null, out, true); + assertEquals("'NULL'", out.toString()); + // reset + out.setLength(0); + builder.setQuote((Character) null); + builder.get().print(null, out, true); + assertEquals("\"NULL\"", out.toString()); + // reset, reverse setter order + out.setLength(0); + builder.setNullString(null).setQuote((Character) null).setNullString("NULL"); + builder.get().print(null, out, true); + assertEquals("\"NULL\"", out.toString()); } @Test - public void testRFC4180() { - assertEquals(null, RFC4180.getCommentMarker()); + void testQuoteModeNoneShouldReturnMeaningfulExceptionMessage() { + final Exception exception = assertThrows(IllegalArgumentException.class, () -> + // @formatter:off + CSVFormat.DEFAULT.builder() + .setHeader("Col1", "Col2", "Col3", "Col4") + .setQuoteMode(QuoteMode.NONE) + .get() + // @formatter:on + ); + final String actualMessage = exception.getMessage(); + final String expectedMessage = "Quote mode set to NONE but no escape character is set"; + 
assertEquals(expectedMessage, actualMessage); + } + + @Test + void testQuotePolicyNoneWithoutEscapeThrowsException() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.newFormat('!').builder().setQuoteMode(QuoteMode.NONE).get()); + } + + @SuppressWarnings("deprecation") + @Test + void testQuotePolicyNoneWithoutEscapeThrowsException_Deprecated() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.newFormat('!').withQuoteMode(QuoteMode.NONE)); + } + + @Test + void testRFC4180() { + assertNull(RFC4180.getCommentMarker()); assertEquals(',', RFC4180.getDelimiter()); - assertEquals(null, RFC4180.getEscapeCharacter()); + assertNull(RFC4180.getEscapeCharacter()); assertFalse(RFC4180.getIgnoreEmptyLines()); assertEquals(Character.valueOf('"'), RFC4180.getQuoteCharacter()); - assertEquals(null, RFC4180.getQuoteMode()); + assertNull(RFC4180.getQuoteMode()); assertEquals("\r\n", RFC4180.getRecordSeparator()); } @SuppressWarnings("boxing") // no need to worry about boxing here @Test - public void testSerialization() throws Exception { + void testSerialization() throws Exception { final ByteArrayOutputStream out = new ByteArrayOutputStream(); - try (final ObjectOutputStream oos = new ObjectOutputStream(out)) { + try (ObjectOutputStream oos = new ObjectOutputStream(out)) { oos.writeObject(CSVFormat.DEFAULT); oos.flush(); } @@ -336,51 +1098,267 @@ public void testSerialization() throws Exception { final CSVFormat format = (CSVFormat) in.readObject(); assertNotNull(format); - assertEquals("delimiter", CSVFormat.DEFAULT.getDelimiter(), format.getDelimiter()); - assertEquals("encapsulator", CSVFormat.DEFAULT.getQuoteCharacter(), format.getQuoteCharacter()); - assertEquals("comment start", CSVFormat.DEFAULT.getCommentMarker(), format.getCommentMarker()); - assertEquals("record separator", CSVFormat.DEFAULT.getRecordSeparator(), format.getRecordSeparator()); - assertEquals("escape", CSVFormat.DEFAULT.getEscapeCharacter(), format.getEscapeCharacter()); - 
assertEquals("trim", CSVFormat.DEFAULT.getIgnoreSurroundingSpaces(), format.getIgnoreSurroundingSpaces()); - assertEquals("empty lines", CSVFormat.DEFAULT.getIgnoreEmptyLines(), format.getIgnoreEmptyLines()); + assertEquals(CSVFormat.DEFAULT.getDelimiter(), format.getDelimiter(), "delimiter"); + assertEquals(CSVFormat.DEFAULT.getQuoteCharacter(), format.getQuoteCharacter(), "encapsulator"); + assertEquals(CSVFormat.DEFAULT.getCommentMarker(), format.getCommentMarker(), "comment start"); + assertEquals(CSVFormat.DEFAULT.getRecordSeparator(), format.getRecordSeparator(), "record separator"); + assertEquals(CSVFormat.DEFAULT.getEscapeCharacter(), format.getEscapeCharacter(), "escape"); + assertEquals(CSVFormat.DEFAULT.getIgnoreSurroundingSpaces(), format.getIgnoreSurroundingSpaces(), "trim"); + assertEquals(CSVFormat.DEFAULT.getIgnoreEmptyLines(), format.getIgnoreEmptyLines(), "empty lines"); + } + + @Test + void testToString() { + + final String string = CSVFormat.INFORMIX_UNLOAD.toString(); + + assertEquals("Delimiter=<|> Escape=<\\> QuoteChar=<\"> RecordSeparator=<\n> EmptyLines:ignored SkipHeaderRecord:false", string); + } @Test - public void testWithCommentStart() throws Exception { + void testToStringAndWithCommentMarkerTakingCharacter() { + + final CSVFormat.Predefined csvFormatPredefined = CSVFormat.Predefined.Default; + final CSVFormat csvFormat = csvFormatPredefined.getFormat(); + + assertNull(csvFormat.getEscapeCharacter()); + assertTrue(csvFormat.isQuoteCharacterSet()); + + assertFalse(csvFormat.getTrim()); + assertFalse(csvFormat.getIgnoreSurroundingSpaces()); + + assertFalse(csvFormat.getTrailingDelimiter()); + assertEquals(',', csvFormat.getDelimiter()); + + assertFalse(csvFormat.getIgnoreHeaderCase()); + assertEquals("\r\n", csvFormat.getRecordSeparator()); + + assertFalse(csvFormat.isCommentMarkerSet()); + assertNull(csvFormat.getCommentMarker()); + + assertFalse(csvFormat.isNullStringSet()); + assertFalse(csvFormat.getAllowMissingColumnNames()); + + 
assertFalse(csvFormat.isEscapeCharacterSet()); + assertFalse(csvFormat.getSkipHeaderRecord()); + + assertNull(csvFormat.getNullString()); + assertNull(csvFormat.getQuoteMode()); + + assertTrue(csvFormat.getIgnoreEmptyLines()); + assertEquals('\"', (char) csvFormat.getQuoteCharacter()); + + final Character character = Character.valueOf('n'); + + final CSVFormat csvFormatTwo = csvFormat.withCommentMarker(character); + + assertNull(csvFormat.getEscapeCharacter()); + assertTrue(csvFormat.isQuoteCharacterSet()); + + assertFalse(csvFormat.getTrim()); + assertFalse(csvFormat.getIgnoreSurroundingSpaces()); + + assertFalse(csvFormat.getTrailingDelimiter()); + assertEquals(',', csvFormat.getDelimiter()); + + assertFalse(csvFormat.getIgnoreHeaderCase()); + assertEquals("\r\n", csvFormat.getRecordSeparator()); + + assertFalse(csvFormat.isCommentMarkerSet()); + assertNull(csvFormat.getCommentMarker()); + + assertFalse(csvFormat.isNullStringSet()); + assertFalse(csvFormat.getAllowMissingColumnNames()); + + assertFalse(csvFormat.isEscapeCharacterSet()); + assertFalse(csvFormat.getSkipHeaderRecord()); + + assertNull(csvFormat.getNullString()); + assertNull(csvFormat.getQuoteMode()); + + assertTrue(csvFormat.getIgnoreEmptyLines()); + assertEquals('\"', (char) csvFormat.getQuoteCharacter()); + + assertFalse(csvFormatTwo.isNullStringSet()); + assertFalse(csvFormatTwo.getAllowMissingColumnNames()); + + assertEquals('\"', (char) csvFormatTwo.getQuoteCharacter()); + assertNull(csvFormatTwo.getNullString()); + + assertEquals(',', csvFormatTwo.getDelimiter()); + assertFalse(csvFormatTwo.getTrailingDelimiter()); + + assertTrue(csvFormatTwo.isCommentMarkerSet()); + assertFalse(csvFormatTwo.getIgnoreHeaderCase()); + + assertFalse(csvFormatTwo.getTrim()); + assertNull(csvFormatTwo.getEscapeCharacter()); + + assertTrue(csvFormatTwo.isQuoteCharacterSet()); + assertFalse(csvFormatTwo.getIgnoreSurroundingSpaces()); + + assertEquals("\r\n", csvFormatTwo.getRecordSeparator()); + 
assertNull(csvFormatTwo.getQuoteMode()); + + assertEquals('n', (char) csvFormatTwo.getCommentMarker()); + assertFalse(csvFormatTwo.getSkipHeaderRecord()); + + assertFalse(csvFormatTwo.isEscapeCharacterSet()); + assertTrue(csvFormatTwo.getIgnoreEmptyLines()); + + assertNotSame(csvFormat, csvFormatTwo); + assertNotSame(csvFormatTwo, csvFormat); + + assertNotEquals(csvFormatTwo, csvFormat); + + assertNull(csvFormat.getEscapeCharacter()); + assertTrue(csvFormat.isQuoteCharacterSet()); + + assertFalse(csvFormat.getTrim()); + assertFalse(csvFormat.getIgnoreSurroundingSpaces()); + + assertFalse(csvFormat.getTrailingDelimiter()); + assertEquals(',', csvFormat.getDelimiter()); + + assertFalse(csvFormat.getIgnoreHeaderCase()); + assertEquals("\r\n", csvFormat.getRecordSeparator()); + + assertFalse(csvFormat.isCommentMarkerSet()); + assertNull(csvFormat.getCommentMarker()); + + assertFalse(csvFormat.isNullStringSet()); + assertFalse(csvFormat.getAllowMissingColumnNames()); + + assertFalse(csvFormat.isEscapeCharacterSet()); + assertFalse(csvFormat.getSkipHeaderRecord()); + + assertNull(csvFormat.getNullString()); + assertNull(csvFormat.getQuoteMode()); + + assertTrue(csvFormat.getIgnoreEmptyLines()); + assertEquals('\"', (char) csvFormat.getQuoteCharacter()); + + assertFalse(csvFormatTwo.isNullStringSet()); + assertFalse(csvFormatTwo.getAllowMissingColumnNames()); + + assertEquals('\"', (char) csvFormatTwo.getQuoteCharacter()); + assertNull(csvFormatTwo.getNullString()); + + assertEquals(',', csvFormatTwo.getDelimiter()); + assertFalse(csvFormatTwo.getTrailingDelimiter()); + + assertTrue(csvFormatTwo.isCommentMarkerSet()); + assertFalse(csvFormatTwo.getIgnoreHeaderCase()); + + assertFalse(csvFormatTwo.getTrim()); + assertNull(csvFormatTwo.getEscapeCharacter()); + + assertTrue(csvFormatTwo.isQuoteCharacterSet()); + assertFalse(csvFormatTwo.getIgnoreSurroundingSpaces()); + + assertEquals("\r\n", csvFormatTwo.getRecordSeparator()); + assertNull(csvFormatTwo.getQuoteMode()); + + 
assertEquals('n', (char) csvFormatTwo.getCommentMarker()); + assertFalse(csvFormatTwo.getSkipHeaderRecord()); + + assertFalse(csvFormatTwo.isEscapeCharacterSet()); + assertTrue(csvFormatTwo.getIgnoreEmptyLines()); + + assertNotSame(csvFormat, csvFormatTwo); + assertNotSame(csvFormatTwo, csvFormat); + + assertNotEquals(csvFormat, csvFormatTwo); + + assertNotEquals(csvFormatTwo, csvFormat); + assertEquals("Delimiter=<,> QuoteChar=<\"> CommentStart=<n> RecordSeparator=<\r\n> EmptyLines:ignored SkipHeaderRecord:false", + csvFormatTwo.toString()); + + } + + @Test + void testTrim() throws IOException { + final CSVFormat formatWithTrim = CSVFormat.DEFAULT.withDelimiter(',').withTrim().withQuote(null).withRecordSeparator(CRLF); + + CharSequence in = "a,b,c"; + final StringBuilder out = new StringBuilder(); + formatWithTrim.print(in, out, true); + assertEquals("a,b,c", out.toString()); + + in = new StringBuilder(" x,y,z"); + out.setLength(0); + formatWithTrim.print(in, out, true); + assertEquals("x,y,z", out.toString()); + + in = new StringBuilder(""); + out.setLength(0); + formatWithTrim.print(in, out, true); + assertEquals("", out.toString()); + + in = new StringBuilder("header\r\n"); + out.setLength(0); + formatWithTrim.print(in, out, true); + assertEquals("header", out.toString()); + } + + @Test + void testWithCommentStart() { final CSVFormat formatWithCommentStart = CSVFormat.DEFAULT.withCommentMarker('#'); - assertEquals( Character.valueOf('#'), formatWithCommentStart.getCommentMarker()); + assertEquals(Character.valueOf('#'), formatWithCommentStart.getCommentMarker()); } - @Test(expected = IllegalArgumentException.class) - public void testWithCommentStartCRThrowsException() { - CSVFormat.DEFAULT.withCommentMarker(CR); + @Test + void testWithCommentStartCRThrowsException() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.withCommentMarker(CR)); } @Test - public void testWithDelimiter() throws Exception { + void testWithDelimiter() { final 
CSVFormat formatWithDelimiter = CSVFormat.DEFAULT.withDelimiter('!'); assertEquals('!', formatWithDelimiter.getDelimiter()); } - @Test(expected = IllegalArgumentException.class) - public void testWithDelimiterLFThrowsException() { - CSVFormat.DEFAULT.withDelimiter(LF); + @Test + void testWithDelimiterLFThrowsException() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.withDelimiter(LF)); + } + + @Test + void testWithEmptyDuplicates() { + final CSVFormat formatWithEmptyDuplicates = CSVFormat.DEFAULT.builder().setDuplicateHeaderMode(DuplicateHeaderMode.ALLOW_EMPTY).get(); + + assertEquals(DuplicateHeaderMode.ALLOW_EMPTY, formatWithEmptyDuplicates.getDuplicateHeaderMode()); + assertFalse(formatWithEmptyDuplicates.getAllowDuplicateHeaderNames()); + } + + @Test + void testWithEmptyEnum() { + final CSVFormat formatWithHeader = CSVFormat.DEFAULT.withHeader(EmptyEnum.class); + assertEquals(0, formatWithHeader.getHeader().length); } @Test - public void testWithEscape() throws Exception { + void testWithEscape() { final CSVFormat formatWithEscape = CSVFormat.DEFAULT.withEscape('&'); assertEquals(Character.valueOf('&'), formatWithEscape.getEscapeCharacter()); } - @Test(expected = IllegalArgumentException.class) - public void testWithEscapeCRThrowsExceptions() { - CSVFormat.DEFAULT.withEscape(CR); + @Test + void testWithEscapeCRThrowsExceptions() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.withEscape(CR)); + } + + @Test + void testWithFirstRecordAsHeader() { + final CSVFormat formatWithFirstRecordAsHeader = CSVFormat.DEFAULT.withFirstRecordAsHeader(); + assertTrue(formatWithFirstRecordAsHeader.getSkipHeaderRecord()); + assertEquals(0, formatWithFirstRecordAsHeader.getHeader().length); } @Test - public void testWithHeader() throws Exception { - final String[] header = new String[]{"one", "two", "three"}; + void testWithHeader() { + final String[] header = { "one", "two", "three" }; // withHeader() makes a copy of the header 
array. final CSVFormat formatWithHeader = CSVFormat.DEFAULT.withHeader(header); assertArrayEquals(header, formatWithHeader.getHeader()); @@ -388,81 +1366,242 @@ public void testWithHeader() throws Exception { } @Test - public void testWithHeaderEnum() throws Exception { + void testWithHeaderComments() { + + final CSVFormat csvFormat = CSVFormat.DEFAULT; + + assertEquals('\"', (char) csvFormat.getQuoteCharacter()); + assertFalse(csvFormat.isCommentMarkerSet()); + + assertFalse(csvFormat.isEscapeCharacterSet()); + assertTrue(csvFormat.isQuoteCharacterSet()); + + assertFalse(csvFormat.getSkipHeaderRecord()); + assertNull(csvFormat.getQuoteMode()); + + assertEquals(',', csvFormat.getDelimiter()); + assertTrue(csvFormat.getIgnoreEmptyLines()); + + assertFalse(csvFormat.getIgnoreHeaderCase()); + assertNull(csvFormat.getCommentMarker()); + + assertEquals("\r\n", csvFormat.getRecordSeparator()); + assertFalse(csvFormat.getTrailingDelimiter()); + + assertFalse(csvFormat.getAllowMissingColumnNames()); + assertFalse(csvFormat.getTrim()); + + assertFalse(csvFormat.isNullStringSet()); + assertNull(csvFormat.getNullString()); + + assertFalse(csvFormat.getIgnoreSurroundingSpaces()); + assertNull(csvFormat.getEscapeCharacter()); + + final Object[] objectArray = new Object[8]; + final CSVFormat csvFormatTwo = csvFormat.withHeaderComments(objectArray); + + assertEquals('\"', (char) csvFormat.getQuoteCharacter()); + assertFalse(csvFormat.isCommentMarkerSet()); + + assertFalse(csvFormat.isEscapeCharacterSet()); + assertTrue(csvFormat.isQuoteCharacterSet()); + + assertFalse(csvFormat.getSkipHeaderRecord()); + assertNull(csvFormat.getQuoteMode()); + + assertEquals(',', csvFormat.getDelimiter()); + assertTrue(csvFormat.getIgnoreEmptyLines()); + + assertFalse(csvFormat.getIgnoreHeaderCase()); + assertNull(csvFormat.getCommentMarker()); + + assertEquals("\r\n", csvFormat.getRecordSeparator()); + assertFalse(csvFormat.getTrailingDelimiter()); + + 
assertFalse(csvFormat.getAllowMissingColumnNames()); + assertFalse(csvFormat.getTrim()); + + assertFalse(csvFormat.isNullStringSet()); + assertNull(csvFormat.getNullString()); + + assertFalse(csvFormat.getIgnoreSurroundingSpaces()); + assertNull(csvFormat.getEscapeCharacter()); + + assertFalse(csvFormatTwo.getIgnoreHeaderCase()); + assertNull(csvFormatTwo.getQuoteMode()); + + assertTrue(csvFormatTwo.getIgnoreEmptyLines()); + assertFalse(csvFormatTwo.getIgnoreSurroundingSpaces()); + + assertNull(csvFormatTwo.getEscapeCharacter()); + assertFalse(csvFormatTwo.getTrim()); + + assertFalse(csvFormatTwo.isEscapeCharacterSet()); + assertTrue(csvFormatTwo.isQuoteCharacterSet()); + + assertFalse(csvFormatTwo.getSkipHeaderRecord()); + assertEquals('\"', (char) csvFormatTwo.getQuoteCharacter()); + + assertFalse(csvFormatTwo.getAllowMissingColumnNames()); + assertNull(csvFormatTwo.getNullString()); + + assertFalse(csvFormatTwo.isNullStringSet()); + assertFalse(csvFormatTwo.getTrailingDelimiter()); + + assertEquals("\r\n", csvFormatTwo.getRecordSeparator()); + assertEquals(',', csvFormatTwo.getDelimiter()); + + assertNull(csvFormatTwo.getCommentMarker()); + assertFalse(csvFormatTwo.isCommentMarkerSet()); + + assertNotSame(csvFormat, csvFormatTwo); + assertNotSame(csvFormatTwo, csvFormat); + + assertNotEquals(csvFormatTwo, csvFormat); // CSV-244 - should not be equal + + final String string = csvFormatTwo.format(objectArray); + + assertEquals('\"', (char) csvFormat.getQuoteCharacter()); + assertFalse(csvFormat.isCommentMarkerSet()); + + assertFalse(csvFormat.isEscapeCharacterSet()); + assertTrue(csvFormat.isQuoteCharacterSet()); + + assertFalse(csvFormat.getSkipHeaderRecord()); + assertNull(csvFormat.getQuoteMode()); + + assertEquals(',', csvFormat.getDelimiter()); + assertTrue(csvFormat.getIgnoreEmptyLines()); + + assertFalse(csvFormat.getIgnoreHeaderCase()); + assertNull(csvFormat.getCommentMarker()); + + assertEquals("\r\n", csvFormat.getRecordSeparator()); + 
assertFalse(csvFormat.getTrailingDelimiter()); + + assertFalse(csvFormat.getAllowMissingColumnNames()); + assertFalse(csvFormat.getTrim()); + + assertFalse(csvFormat.isNullStringSet()); + assertNull(csvFormat.getNullString()); + + assertFalse(csvFormat.getIgnoreSurroundingSpaces()); + assertNull(csvFormat.getEscapeCharacter()); + + assertFalse(csvFormatTwo.getIgnoreHeaderCase()); + assertNull(csvFormatTwo.getQuoteMode()); + + assertTrue(csvFormatTwo.getIgnoreEmptyLines()); + assertFalse(csvFormatTwo.getIgnoreSurroundingSpaces()); + + assertNull(csvFormatTwo.getEscapeCharacter()); + assertFalse(csvFormatTwo.getTrim()); + + assertFalse(csvFormatTwo.isEscapeCharacterSet()); + assertTrue(csvFormatTwo.isQuoteCharacterSet()); + + assertFalse(csvFormatTwo.getSkipHeaderRecord()); + assertEquals('\"', (char) csvFormatTwo.getQuoteCharacter()); + + assertFalse(csvFormatTwo.getAllowMissingColumnNames()); + assertNull(csvFormatTwo.getNullString()); + + assertFalse(csvFormatTwo.isNullStringSet()); + assertFalse(csvFormatTwo.getTrailingDelimiter()); + + assertEquals("\r\n", csvFormatTwo.getRecordSeparator()); + assertEquals(',', csvFormatTwo.getDelimiter()); + + assertNull(csvFormatTwo.getCommentMarker()); + assertFalse(csvFormatTwo.isCommentMarkerSet()); + + assertNotSame(csvFormat, csvFormatTwo); + assertNotSame(csvFormatTwo, csvFormat); + + assertNotNull(string); + assertNotEquals(csvFormat, csvFormatTwo); // CSV-244 - should not be equal + + assertNotEquals(csvFormatTwo, csvFormat); // CSV-244 - should not be equal + assertEquals(",,,,,,,", string); + + } + + @Test + void testWithHeaderEnum() { final CSVFormat formatWithHeader = CSVFormat.DEFAULT.withHeader(Header.class); - assertArrayEquals(new String[]{ "Name", "Email", "Phone" }, formatWithHeader.getHeader()); + assertArrayEquals(new String[] { "Name", "Email", "Phone" }, formatWithHeader.getHeader()); } @Test - public void testWithEmptyEnum() throws Exception { - final CSVFormat formatWithHeader = 
CSVFormat.DEFAULT.withHeader(EmptyEnum.class); - Assert.assertTrue(formatWithHeader.getHeader().length == 0); + void testWithHeaderEnumNull() { + final CSVFormat format = CSVFormat.DEFAULT; + final Class<Enum<?>> simpleName = null; + format.withHeader(simpleName); + } + + @Test + void testWithHeaderResultSetNull() throws SQLException { + final CSVFormat format = CSVFormat.DEFAULT; + final ResultSet resultSet = null; + format.withHeader(resultSet); } @Test - public void testWithIgnoreEmptyLines() throws Exception { + void testWithIgnoreEmptyLines() { assertFalse(CSVFormat.DEFAULT.withIgnoreEmptyLines(false).getIgnoreEmptyLines()); assertTrue(CSVFormat.DEFAULT.withIgnoreEmptyLines().getIgnoreEmptyLines()); } @Test - public void testWithIgnoreSurround() throws Exception { + void testWithIgnoreSurround() { assertFalse(CSVFormat.DEFAULT.withIgnoreSurroundingSpaces(false).getIgnoreSurroundingSpaces()); assertTrue(CSVFormat.DEFAULT.withIgnoreSurroundingSpaces().getIgnoreSurroundingSpaces()); } @Test - public void testWithNullString() throws Exception { + void testWithNullString() { final CSVFormat formatWithNullString = CSVFormat.DEFAULT.withNullString("null"); assertEquals("null", formatWithNullString.getNullString()); } @Test - public void testWithQuoteChar() throws Exception { + void testWithQuoteChar() { final CSVFormat formatWithQuoteChar = CSVFormat.DEFAULT.withQuote('"'); assertEquals(Character.valueOf('"'), formatWithQuoteChar.getQuoteCharacter()); } - @Test(expected = IllegalArgumentException.class) - public void testWithQuoteLFThrowsException() { - CSVFormat.DEFAULT.withQuote(LF); + @Test + void testWithQuoteLFThrowsException() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.withQuote(LF)); } @Test - public void testWithQuotePolicy() throws Exception { + void testWithQuotePolicy() { final CSVFormat formatWithQuotePolicy = CSVFormat.DEFAULT.withQuoteMode(QuoteMode.ALL); assertEquals(QuoteMode.ALL, formatWithQuotePolicy.getQuoteMode()); } @Test - 
public void testWithRecordSeparatorCR() throws Exception { + void testWithRecordSeparatorCR() { final CSVFormat formatWithRecordSeparator = CSVFormat.DEFAULT.withRecordSeparator(CR); assertEquals(String.valueOf(CR), formatWithRecordSeparator.getRecordSeparator()); } @Test - public void testWithRecordSeparatorLF() throws Exception { - final CSVFormat formatWithRecordSeparator = CSVFormat.DEFAULT.withRecordSeparator(LF); - assertEquals(String.valueOf(LF), formatWithRecordSeparator.getRecordSeparator()); - } - - @Test - public void testWithRecordSeparatorCRLF() throws Exception { + void testWithRecordSeparatorCRLF() { final CSVFormat formatWithRecordSeparator = CSVFormat.DEFAULT.withRecordSeparator(CRLF); assertEquals(CRLF, formatWithRecordSeparator.getRecordSeparator()); } @Test - public void testWithFirstRecordAsHeader() throws Exception { - final CSVFormat formatWithFirstRecordAsHeader = CSVFormat.DEFAULT.withFirstRecordAsHeader(); - assertTrue(formatWithFirstRecordAsHeader.getSkipHeaderRecord()); - assertTrue(formatWithFirstRecordAsHeader.getHeader().length == 0); - } - - public enum Header { - Name, Email, Phone + void testWithRecordSeparatorLF() { + final CSVFormat formatWithRecordSeparator = CSVFormat.DEFAULT.withRecordSeparator(LF); + assertEquals(String.valueOf(LF), formatWithRecordSeparator.getRecordSeparator()); } - public enum EmptyEnum { + @Test + void testWithSystemRecordSeparator() { + final CSVFormat formatWithRecordSeparator = CSVFormat.DEFAULT.withSystemRecordSeparator(); + assertEquals(System.lineSeparator(), formatWithRecordSeparator.getRecordSeparator()); } } diff --git a/src/test/java/org/apache/commons/csv/CSVParserTest.java b/src/test/java/org/apache/commons/csv/CSVParserTest.java index c547b0d94b..6d9bdd9e80 100644 --- a/src/test/java/org/apache/commons/csv/CSVParserTest.java +++ b/src/test/java/org/apache/commons/csv/CSVParserTest.java @@ -1,18 +1,20 @@ /* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor 
license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 + * https://www.apache.org/licenses/LICENSE-2.0 * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
*/ package org.apache.commons.csv; @@ -20,236 +22,405 @@ import static org.apache.commons.csv.Constants.CR; import static org.apache.commons.csv.Constants.CRLF; import static org.apache.commons.csv.Constants.LF; -import static org.junit.Assert.assertArrayEquals; -import static org.junit.Assert.assertEquals; -import static org.junit.Assert.assertFalse; -import static org.junit.Assert.assertNotNull; -import static org.junit.Assert.assertNull; -import static org.junit.Assert.assertTrue; -import static org.junit.Assert.fail; +import static org.apache.commons.csv.CsvAssertions.assertValuesEquals; +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertInstanceOf; +import static org.junit.jupiter.api.Assertions.assertNotNull; +import static org.junit.jupiter.api.Assertions.assertNull; +import static org.junit.jupiter.api.Assertions.assertThrows; +import static org.junit.jupiter.api.Assertions.assertTrue; import java.io.File; import java.io.IOException; +import java.io.InputStream; import java.io.InputStreamReader; import java.io.PipedReader; import java.io.PipedWriter; import java.io.Reader; import java.io.StringReader; import java.io.StringWriter; +import java.io.UncheckedIOException; import java.net.URL; import java.nio.charset.Charset; import java.nio.charset.StandardCharsets; +import java.nio.file.Files; +import java.nio.file.Path; +import java.nio.file.Paths; import java.util.ArrayList; +import java.util.Arrays; import java.util.Iterator; import java.util.List; import java.util.Map; import java.util.NoSuchElementException; +import java.util.stream.Collectors; +import java.util.stream.Stream; import org.apache.commons.io.input.BOMInputStream; -import org.junit.Assert; -import org.junit.Ignore; -import org.junit.Test; +import org.apache.commons.io.input.BrokenInputStream; +import org.junit.jupiter.api.Assertions; +import 
org.junit.jupiter.api.Disabled; +import org.junit.jupiter.api.Test; +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.EnumSource; +import org.junit.jupiter.params.provider.ValueSource; /** - * CSVParserTest + * Tests {@link CSVParser}. * - * The test are organized in three different sections: The 'setter/getter' section, the lexer section and finally the - * parser section. In case a test fails, you should follow a top-down approach for fixing a potential bug (its likely - * that the parser itself fails if the lexer has problems...). - * - * @version $Id$ + * The test are organized in three different sections: The 'setter/getter' section, the lexer section and finally the parser section. In case a test fails, you + * should follow a top-down approach for fixing a potential bug (its likely that the parser itself fails if the lexer has problems...). */ -public class CSVParserTest { +class CSVParserTest { + + private static final CSVFormat EXCEL_WITH_HEADER = CSVFormat.EXCEL.withHeader(); private static final Charset UTF_8 = StandardCharsets.UTF_8; private static final String UTF_8_NAME = UTF_8.name(); - private static final String CSV_INPUT = "a,b,c,d\n" + " a , b , 1 2 \n" + "\"foo baar\", b,\n" + // @formatter:off + private static final String CSV_INPUT = "a,b,c,d\n" + + " a , b , 1 2 \n" + + "\"foo baar\", b,\n" + // + " \"foo\n,,\n\"\",,\n\\\"\",d,e\n"; - + " \"foo\n,,\n\"\",,\n\"\"\",d,e\n"; // changed to use standard CSV escaping + " \"foo\n,,\n\"\",,\n\"\"\",d,e\n"; // changed to use standard CSV escaping + // @formatter:on private static final String CSV_INPUT_1 = "a,b,c,d"; private static final String CSV_INPUT_2 = "a,b,1 2"; - private static final String[][] RESULT = { { "a", "b", "c", "d" }, { "a", "b", "1 2" }, { "foo baar", "b", "" }, - { "foo\n,,\n\",,\n\"", "d", "e" } }; + private static final String[][] RESULT = { { "a", "b", "c", "d" }, { "a", "b", "1 2" }, { "foo baar", "b", "" }, { "foo\n,,\n\",,\n\"", 
"d", "e" } }; + + // CSV with no header comments + private static final String CSV_INPUT_NO_COMMENT = "A,B" + CRLF + "1,2" + CRLF; + + // CSV with a header comment + private static final String CSV_INPUT_HEADER_COMMENT = "# header comment" + CRLF + "A,B" + CRLF + "1,2" + CRLF; + + // CSV with a single line header and trailer comment + private static final String CSV_INPUT_HEADER_TRAILER_COMMENT = "# header comment" + CRLF + "A,B" + CRLF + "1,2" + CRLF + "# comment"; + + // CSV with a multi-line header and trailer comment + private static final String CSV_INPUT_MULTILINE_HEADER_TRAILER_COMMENT = "# multi-line" + CRLF + "# header comment" + CRLF + "A,B" + CRLF + "1,2" + CRLF + + "# multi-line" + CRLF + "# comment"; + + // Format with auto-detected header + private static final CSVFormat FORMAT_AUTO_HEADER = CSVFormat.Builder.create(CSVFormat.DEFAULT).setCommentMarker('#').setHeader().get(); + + // Format with explicit header + // @formatter:off + private static final CSVFormat FORMAT_EXPLICIT_HEADER = CSVFormat.Builder.create(CSVFormat.DEFAULT) + .setSkipHeaderRecord(true) + .setCommentMarker('#') + .setHeader("A", "B") + .get(); + // @formatter:on - private BOMInputStream createBOMInputStream(String resource) throws IOException { - final URL url = ClassLoader.getSystemClassLoader().getResource(resource); - return new BOMInputStream(url.openStream()); + // Format with explicit header that does not skip the header line + // @formatter:off + CSVFormat FORMAT_EXPLICIT_HEADER_NOSKIP = CSVFormat.Builder.create(CSVFormat.DEFAULT) + .setCommentMarker('#') + .setHeader("A", "B") + .get(); + // @formatter:on + + @SuppressWarnings("resource") // caller releases + private BOMInputStream createBOMInputStream(final String resource) throws IOException { + return new BOMInputStream(ClassLoader.getSystemClassLoader().getResource(resource).openStream()); + } + + CSVRecord parse(final CSVParser parser, final int failParseRecordNo) throws IOException { + if (parser.getRecordNumber() + 
1 == failParseRecordNo) { + assertThrows(IOException.class, () -> parser.nextRecord()); + return null; + } + return parser.nextRecord(); + } + + private void parseFully(final CSVParser parser) { + parser.forEach(Assertions::assertNotNull); } - - @Test - public void testBackslashEscaping() throws IOException { + @Test + void testBackslashEscaping() throws IOException { // To avoid confusion over the need for escaping chars in java code, // We will test with a forward slash as the escape char, and a single // quote as the encapsulator. - final String code = "one,two,three\n" // 0 - + "'',''\n" // 1) empty encapsulators - + "/',/'\n" // 2) single encapsulators - + "'/'','/''\n" // 3) single encapsulators encapsulated via escape - + "'''',''''\n" // 4) single encapsulators encapsulated via doubling - + "/,,/,\n" // 5) separator escaped - + "//,//\n" // 6) escape escaped - + "'//','//'\n" // 7) escape escaped in encapsulation - + " 8 , \"quoted \"\" /\" // string\" \n" // don't eat spaces - + "9, /\n \n" // escaped newline - + ""; - final String[][] res = { { "one", "two", "three" }, // 0 - { "", "" }, // 1 - { "'", "'" }, // 2 - { "'", "'" }, // 3 - { "'", "'" }, // 4 - { ",", "," }, // 5 - { "/", "/" }, // 6 - { "/", "/" }, // 7 - { " 8 ", " \"quoted \"\" /\" / string\" " }, { "9", " \n " }, }; - - final CSVFormat format = CSVFormat.newFormat(',').withQuote('\'').withRecordSeparator(CRLF).withEscape('/') - .withIgnoreEmptyLines(); - - try (final CSVParser parser = CSVParser.parse(code, format)) { + // @formatter:off + final String code = "one,two,three\n" + // 0 + "'',''\n" + // 1) empty encapsulators + "/',/'\n" + // 2) single encapsulators + "'/'','/''\n" + // 3) single encapsulators encapsulated via escape + "'''',''''\n" + // 4) single encapsulators encapsulated via doubling + "/,,/,\n" + // 5) separator escaped + "//,//\n" + // 6) escape escaped + "'//','//'\n" + // 7) escape escaped in encapsulation + " 8 , \"quoted \"\" /\" // string\" \n" + // don't eat spaces 
+ "9, /\n \n" + // escaped newline + ""; + final String[][] res = {{"one", "two", "three"}, // 0 + {"", ""}, // 1 + {"'", "'"}, // 2 + {"'", "'"}, // 3 + {"'", "'"}, // 4 + {",", ","}, // 5 + {"/", "/"}, // 6 + {"/", "/"}, // 7 + {" 8 ", " \"quoted \"\" /\" / string\" "}, {"9", " \n "} }; + // @formatter:on + final CSVFormat format = CSVFormat.newFormat(',').withQuote('\'').withRecordSeparator(CRLF).withEscape('/').withIgnoreEmptyLines(); + try (CSVParser parser = CSVParser.parse(code, format)) { final List records = parser.getRecords(); - assertTrue(records.size() > 0); - - Utils.compare("Records do not match expected result", res, records); + assertFalse(records.isEmpty()); + Utils.compare("Records do not match expected result", res, records, -1); } } @Test - public void testBackslashEscaping2() throws IOException { - + void testBackslashEscaping2() throws IOException { // To avoid confusion over the need for escaping chars in java code, // We will test with a forward slash as the escape char, and a single // quote as the encapsulator. 
- - final String code = "" + " , , \n" // 1) - + " \t , , \n" // 2) - + " // , /, , /,\n" // 3) - + ""; - final String[][] res = { { " ", " ", " " }, // 1 - { " \t ", " ", " " }, // 2 - { " / ", " , ", " ," }, // 3 + // @formatter:off + final String code = " , , \n" + // 1) + " \t , , \n" + // 2) + " // , /, , /,\n" + // 3) + ""; + final String[][] res = {{" ", " ", " "}, // 1 + {" \t ", " ", " "}, // 2 + {" / ", " , ", " ,"}, // 3 }; - - final CSVFormat format = CSVFormat.newFormat(',').withRecordSeparator(CRLF).withEscape('/') - .withIgnoreEmptyLines(); - - try (final CSVParser parser = CSVParser.parse(code, format)) { + // @formatter:on + final CSVFormat format = CSVFormat.newFormat(',').withRecordSeparator(CRLF).withEscape('/').withIgnoreEmptyLines(); + try (CSVParser parser = CSVParser.parse(code, format)) { final List records = parser.getRecords(); - assertTrue(records.size() > 0); - - Utils.compare("", res, records); + assertFalse(records.isEmpty()); + Utils.compare("", res, records, -1); } } @Test - @Ignore - public void testBackslashEscapingOld() throws IOException { - final String code = "one,two,three\n" + "on\\\"e,two\n" + "on\"e,two\n" + "one,\"tw\\\"o\"\n" + - "one,\"t\\,wo\"\n" + "one,two,\"th,ree\"\n" + "\"a\\\\\"\n" + "a\\,b\n" + "\"a\\\\,b\""; - final String[][] res = { { "one", "two", "three" }, { "on\\\"e", "two" }, { "on\"e", "two" }, - { "one", "tw\"o" }, { "one", "t\\,wo" }, // backslash in quotes only escapes a delimiter (",") + @Disabled + void testBackslashEscapingOld() throws IOException { + // @formatter:off + final String code = "one,two,three\n" + + "on\\\"e,two\n" + + "on\"e,two\n" + + "one,\"tw\\\"o\"\n" + + "one,\"t\\,wo\"\n" + + "one,two,\"th,ree\"\n" + + "\"a\\\\\"\n" + + "a\\,b\n" + + "\"a\\\\,b\""; + // @formatter:on + final String[][] res = { { "one", "two", "three" }, { "on\\\"e", "two" }, { "on\"e", "two" }, { "one", "tw\"o" }, { "one", "t\\,wo" }, // backslash in + // quotes only + // escapes a + // delimiter + // (",") { 
"one", "two", "th,ree" }, { "a\\\\" }, // backslash in quotes only escapes a delimiter (",") - { "a\\", "b" }, // a backslash must be returnd + { "a\\", "b" }, // a backslash must be returned { "a\\\\,b" } // backslash in quotes only escapes a delimiter (",") }; - try (final CSVParser parser = CSVParser.parse(code, CSVFormat.DEFAULT)) { + try (CSVParser parser = CSVParser.parse(code, CSVFormat.DEFAULT)) { final List records = parser.getRecords(); assertEquals(res.length, records.size()); - assertTrue(records.size() > 0); + assertFalse(records.isEmpty()); for (int i = 0; i < res.length; i++) { - assertArrayEquals(res[i], records.get(i).values()); + assertValuesEquals(res[i], records.get(i)); } } } @Test - @Ignore("CSV-107") - public void testBOM() throws IOException { - final URL url = ClassLoader.getSystemClassLoader().getResource("CSVFileParser/bom.csv"); - try (final CSVParser parser = CSVParser.parse(url, Charset.forName(UTF_8_NAME), CSVFormat.EXCEL.withHeader())) { - for (final CSVRecord record : parser) { - final String string = record.get("Date"); - Assert.assertNotNull(string); - // System.out.println("date: " + record.get("Date")); - } + @Disabled("CSV-107") + void testBOM() throws IOException { + final URL url = ClassLoader.getSystemClassLoader().getResource("org/apache/commons/csv/CSVFileParser/bom.csv"); + try (CSVParser parser = CSVParser.parse(url, StandardCharsets.UTF_8, EXCEL_WITH_HEADER)) { + parser.forEach(record -> assertNotNull(record.get("Date"))); } } @Test - public void testBOMInputStream_ParserWithReader() throws IOException { - try (final Reader reader = new InputStreamReader(createBOMInputStream("CSVFileParser/bom.csv"), UTF_8_NAME); - final CSVParser parser = new CSVParser(reader, CSVFormat.EXCEL.withHeader())) { - for (final CSVRecord record : parser) { - final String string = record.get("Date"); - Assert.assertNotNull(string); - // System.out.println("date: " + record.get("Date")); - } + void testBOMInputStreamParserWithInputStream() 
throws IOException { + try (BOMInputStream inputStream = createBOMInputStream("org/apache/commons/csv/CSVFileParser/bom.csv"); + CSVParser parser = CSVParser.parse(inputStream, UTF_8, EXCEL_WITH_HEADER)) { + parser.forEach(record -> assertNotNull(record.get("Date"))); } } @Test - public void testBOMInputStream_parseWithReader() throws IOException { - try (final Reader reader = new InputStreamReader(createBOMInputStream("CSVFileParser/bom.csv"), UTF_8_NAME); - final CSVParser parser = CSVParser.parse(reader, CSVFormat.EXCEL.withHeader())) { - for (final CSVRecord record : parser) { - final String string = record.get("Date"); - Assert.assertNotNull(string); - // System.out.println("date: " + record.get("Date")); - } + void testBOMInputStreamParserWithReader() throws IOException { + try (Reader reader = new InputStreamReader(createBOMInputStream("org/apache/commons/csv/CSVFileParser/bom.csv"), UTF_8_NAME); + CSVParser parser = CSVParser.builder() + .setReader(reader) + .setFormat(EXCEL_WITH_HEADER) + .get()) { + parser.forEach(record -> assertNotNull(record.get("Date"))); } } @Test - public void testBOMInputStream_ParserWithInputStream() throws IOException { - try (final BOMInputStream inputStream = createBOMInputStream("CSVFileParser/bom.csv"); - final CSVParser parser = CSVParser.parse(inputStream, UTF_8, CSVFormat.EXCEL.withHeader())) { - for (final CSVRecord record : parser) { - final String string = record.get("Date"); - Assert.assertNotNull(string); - // System.out.println("date: " + record.get("Date")); - } + void testBOMInputStreamParseWithReader() throws IOException { + try (Reader reader = new InputStreamReader(createBOMInputStream("org/apache/commons/csv/CSVFileParser/bom.csv"), UTF_8_NAME); + CSVParser parser = CSVParser.builder() + .setReader(reader) + .setFormat(EXCEL_WITH_HEADER) + .get()) { + parser.forEach(record -> assertNotNull(record.get("Date"))); } } @Test - public void testCarriageReturnEndings() throws IOException { - final String code = 
"foo\rbaar,\rhello,world\r,kanu"; - try (final CSVParser parser = CSVParser.parse(code, CSVFormat.DEFAULT)) { + void testCarriageReturnEndings() throws IOException { + final String string = "foo\rbaar,\rhello,world\r,kanu"; + try (CSVParser parser = CSVParser.builder().setCharSequence(string).get()) { final List records = parser.getRecords(); assertEquals(4, records.size()); } } @Test - public void testCarriageReturnLineFeedEndings() throws IOException { - final String code = "foo\r\nbaar,\r\nhello,world\r\n,kanu"; - try (final CSVParser parser = CSVParser.parse(code, CSVFormat.DEFAULT)) { + void testCarriageReturnLineFeedEndings() throws IOException { + final String string = "foo\r\nbaar,\r\nhello,world\r\n,kanu"; + try (CSVParser parser = CSVParser.builder().setCharSequence(string).get()) { final List records = parser.getRecords(); assertEquals(4, records.size()); } } - @Test(expected = NoSuchElementException.class) - public void testClose() throws Exception { + @Test + void testClose() throws Exception { final Reader in = new StringReader("# comment\na,b,c\n1,2,3\nx,y,z"); final Iterator records; - try (final CSVParser parser = CSVFormat.DEFAULT.withCommentMarker('#').withHeader().parse(in)) { + try (CSVParser parser = CSVFormat.DEFAULT.withCommentMarker('#').withHeader().parse(in)) { records = parser.iterator(); assertTrue(records.hasNext()); } assertFalse(records.hasNext()); - records.next(); + assertThrows(NoSuchElementException.class, records::next); + } + + @Test + void testCSV141CSVFormat_DEFAULT() throws Exception { + testCSV141Failure(CSVFormat.DEFAULT, 3); + } + + @Test + void testCSV141CSVFormat_INFORMIX_UNLOAD() throws Exception { + testCSV141Failure(CSVFormat.INFORMIX_UNLOAD, 1); + } + + @Test + void testCSV141CSVFormat_INFORMIX_UNLOAD_CSV() throws Exception { + testCSV141Failure(CSVFormat.INFORMIX_UNLOAD_CSV, 3); + } + + @Test + void testCSV141CSVFormat_ORACLE() throws Exception { + testCSV141Failure(CSVFormat.ORACLE, 2); + } + + @Test + void 
testCSV141CSVFormat_POSTGRESQL_CSV() throws Exception { + testCSV141Failure(CSVFormat.POSTGRESQL_CSV, 3); + } + + @Test + void testCSV141Excel() throws Exception { + testCSV141Ok(CSVFormat.EXCEL); + } + + private void testCSV141Failure(final CSVFormat format, final int failParseRecordNo) throws IOException { + final Path path = Paths.get("src/test/resources/org/apache/commons/csv/CSV-141/csv-141.csv"); + try (CSVParser parser = CSVParser.parse(path, StandardCharsets.UTF_8, format)) { + // row 1 + CSVRecord record = parse(parser, failParseRecordNo); + if (record == null) { + return; // expected failure + } + assertEquals("1414770317901", record.get(0)); + assertEquals("android.widget.EditText", record.get(1)); + assertEquals("pass sem1 _84*|*", record.get(2)); + assertEquals("0", record.get(3)); + assertEquals("pass sem1 _8", record.get(4)); + assertEquals(5, record.size()); + // row 2 + record = parse(parser, failParseRecordNo); + if (record == null) { + return; // expected failure + } + assertEquals("1414770318470", record.get(0)); + assertEquals("android.widget.EditText", record.get(1)); + assertEquals("pass sem1 _84:|", record.get(2)); + assertEquals("0", record.get(3)); + assertEquals("pass sem1 _84:\\", record.get(4)); + assertEquals(5, record.size()); + // row 3: Fail for certain + assertThrows(IOException.class, () -> parser.nextRecord()); + } + } + + private void testCSV141Ok(final CSVFormat format) throws IOException { + final Path path = Paths.get("src/test/resources/org/apache/commons/csv/CSV-141/csv-141.csv"); + try (CSVParser parser = CSVParser.parse(path, StandardCharsets.UTF_8, format)) { + // row 1 + CSVRecord record = parser.nextRecord(); + assertEquals("1414770317901", record.get(0)); + assertEquals("android.widget.EditText", record.get(1)); + assertEquals("pass sem1 _84*|*", record.get(2)); + assertEquals("0", record.get(3)); + assertEquals("pass sem1 _8", record.get(4)); + assertEquals(5, record.size()); + // row 2 + record = 
parser.nextRecord(); + assertEquals("1414770318470", record.get(0)); + assertEquals("android.widget.EditText", record.get(1)); + assertEquals("pass sem1 _84:|", record.get(2)); + assertEquals("0", record.get(3)); + assertEquals("pass sem1 _84:\\", record.get(4)); + assertEquals(5, record.size()); + // row 3 + record = parser.nextRecord(); + assertEquals("1414770318327", record.get(0)); + assertEquals("android.widget.EditText", record.get(1)); + assertEquals("pass sem1\n1414770318628\"", record.get(2)); + assertEquals("android.widget.EditText", record.get(3)); + assertEquals("pass sem1 _84*|*", record.get(4)); + assertEquals("0", record.get(5)); + assertEquals("pass sem1\n", record.get(6)); + assertEquals(7, record.size()); + // EOF + record = parser.nextRecord(); + assertNull(record); + } + } + + @Test + void testCSV141RFC4180() throws Exception { + testCSV141Failure(CSVFormat.RFC4180, 3); + } + + @Test + void testCSV235() throws IOException { + final String dqString = "\"aaa\",\"b\"\"bb\",\"ccc\""; // "aaa","b""bb","ccc" + try (CSVParser parser = CSVFormat.RFC4180.parse(new StringReader(dqString))) { + final Iterator records = parser.iterator(); + final CSVRecord record = records.next(); + assertFalse(records.hasNext()); + assertEquals(3, record.size()); + assertEquals("aaa", record.get(0)); + assertEquals("b\"bb", record.get(1)); + assertEquals("ccc", record.get(2)); + } } @Test - public void testCSV57() throws Exception { - try (final CSVParser parser = CSVParser.parse("", CSVFormat.DEFAULT)) { + void testCSV57() throws Exception { + try (CSVParser parser = CSVParser.parse("", CSVFormat.DEFAULT)) { final List list = parser.getRecords(); assertNotNull(list); assertEquals(0, list.size()); @@ -257,144 +428,187 @@ public void testCSV57() throws Exception { } @Test - public void testDefaultFormat() throws IOException { - final String code = "" + "a,b#\n" // 1) - + "\"\n\",\" \",#\n" // 2) - + "#,\"\"\n" // 3) - + "# Final comment\n"// 4) - ; + void 
testDefaultFormat() throws IOException { + // @formatter:off + final String code = "a,b#\n" + // 1) + "\"\n\",\" \",#\n" + // 2) + "#,\"\"\n" + // 3) + "# Final comment\n" // 4) + ; + // @formatter:on final String[][] res = { { "a", "b#" }, { "\n", " ", "#" }, { "#", "" }, { "# Final comment" } }; - CSVFormat format = CSVFormat.DEFAULT; assertFalse(format.isCommentMarkerSet()); - final String[][] res_comments = { { "a", "b#" }, { "\n", " ", "#" }, }; - - try (final CSVParser parser = CSVParser.parse(code, format)) { + final String[][] resComments = { { "a", "b#" }, { "\n", " ", "#" } }; + try (CSVParser parser = CSVParser.parse(code, format)) { final List records = parser.getRecords(); - assertTrue(records.size() > 0); - - Utils.compare("Failed to parse without comments", res, records); - + assertFalse(records.isEmpty()); + Utils.compare("Failed to parse without comments", res, records, -1); format = CSVFormat.DEFAULT.withCommentMarker('#'); } - try (final CSVParser parser = CSVParser.parse(code, format)) { + try (CSVParser parser = CSVParser.parse(code, format)) { final List records = parser.getRecords(); + Utils.compare("Failed to parse with comments", resComments, records, -1); + } + } + + @Test + void testDuplicateHeadersAllowedByDefault() throws Exception { + try (CSVParser parser = CSVParser.parse("a,b,a\n1,2,3\nx,y,z", CSVFormat.DEFAULT.withHeader())) { + // noop + } + } + + @Test + void testDuplicateHeadersNotAllowed() { + assertThrows(IllegalArgumentException.class, + () -> CSVParser.parse("a,b,a\n1,2,3\nx,y,z", CSVFormat.DEFAULT.withHeader().withAllowDuplicateHeaderNames(false))); + } - Utils.compare("Failed to parse with comments", res_comments, records); + /** + * With {@code ignoreSurroundingSpaces} enabled and a multi-character delimiter whose first character is whitespace, + * the empty field at the delimiter boundary must survive. 
The delimiter look-ahead is consumed while skipping + * leading whitespace, so re-evaluating it would drop the empty field and merge the following field's value. + */ + @Test + void testEmptyFieldBeforeWhitespacePrefixedMultiCharacterDelimiter() throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.builder().setDelimiter(" |").setIgnoreSurroundingSpaces(true).get(); + try (CSVParser parser = CSVParser.parse(" |a", format)) { + final List records = parser.getRecords(); + assertEquals(1, records.size()); + assertValuesEquals(new String[] { "", "a" }, records.get(0)); + } + try (CSVParser parser = CSVParser.parse("a | |b", format)) { + final List records = parser.getRecords(); + assertEquals(1, records.size()); + assertValuesEquals(new String[] { "a", "", "b" }, records.get(0)); + } + try (CSVParser parser = CSVParser.parse("a | |b |", format)) { + final List records = parser.getRecords(); + assertEquals(1, records.size()); + assertValuesEquals(new String[] { "a", "", "b", "" }, records.get(0)); } } - @Test(expected = IllegalArgumentException.class) - public void testDuplicateHeaders() throws Exception { - CSVParser.parse("a,b,a\n1,2,3\nx,y,z", CSVFormat.DEFAULT.withHeader(new String[] {})); + @Test + void testEmptyFile() throws Exception { + try (CSVParser parser = CSVParser.parse(Paths.get("src/test/resources/org/apache/commons/csv/empty.txt"), StandardCharsets.UTF_8, + CSVFormat.DEFAULT)) { + assertNull(parser.nextRecord()); + } } @Test - public void testEmptyFile() throws Exception { - try (final CSVParser parser = CSVParser.parse("", CSVFormat.DEFAULT)) { + void testEmptyFileHeaderParsing() throws Exception { + try (CSVParser parser = CSVParser.parse("", CSVFormat.DEFAULT.withFirstRecordAsHeader())) { assertNull(parser.nextRecord()); + assertTrue(parser.getHeaderNames().isEmpty()); } } @Test - public void testEmptyLineBehaviourCSV() throws Exception { + void testEmptyLineBehaviorCSV() throws Exception { final String[] codes = { "hello,\r\n\r\n\r\n", 
"hello,\n\n\n", "hello,\"\"\r\n\r\n\r\n", "hello,\"\"\n\n\n" }; final String[][] res = { { "hello", "" } // CSV format ignores empty lines }; for (final String code : codes) { - try (final CSVParser parser = CSVParser.parse(code, CSVFormat.DEFAULT)) { + try (CSVParser parser = CSVParser.parse(code, CSVFormat.DEFAULT)) { final List records = parser.getRecords(); assertEquals(res.length, records.size()); - assertTrue(records.size() > 0); + assertFalse(records.isEmpty()); for (int i = 0; i < res.length; i++) { - assertArrayEquals(res[i], records.get(i).values()); + assertValuesEquals(res[i], records.get(i)); } } } } @Test - public void testEmptyLineBehaviourExcel() throws Exception { + void testEmptyLineBehaviorExcel() throws Exception { final String[] codes = { "hello,\r\n\r\n\r\n", "hello,\n\n\n", "hello,\"\"\r\n\r\n\r\n", "hello,\"\"\n\n\n" }; final String[][] res = { { "hello", "" }, { "" }, // Excel format does not ignore empty lines { "" } }; for (final String code : codes) { - try (final CSVParser parser = CSVParser.parse(code, CSVFormat.EXCEL)) { + try (CSVParser parser = CSVParser.parse(code, CSVFormat.EXCEL)) { final List records = parser.getRecords(); assertEquals(res.length, records.size()); - assertTrue(records.size() > 0); + assertFalse(records.isEmpty()); for (int i = 0; i < res.length; i++) { - assertArrayEquals(res[i], records.get(i).values()); + assertValuesEquals(res[i], records.get(i)); } } } } @Test - public void testEndOfFileBehaviorCSV() throws Exception { - final String[] codes = { "hello,\r\n\r\nworld,\r\n", "hello,\r\n\r\nworld,", "hello,\r\n\r\nworld,\"\"\r\n", - "hello,\r\n\r\nworld,\"\"", "hello,\r\n\r\nworld,\n", "hello,\r\n\r\nworld,", - "hello,\r\n\r\nworld,\"\"\n", "hello,\r\n\r\nworld,\"\"" }; + void testEmptyString() throws Exception { + try (CSVParser parser = CSVParser.parse("", CSVFormat.DEFAULT)) { + assertNull(parser.nextRecord()); + } + } + + @Test + void testEndOfFileBehaviorCSV() throws Exception { + final String[] codes = { 
"hello,\r\n\r\nworld,\r\n", "hello,\r\n\r\nworld,", "hello,\r\n\r\nworld,\"\"\r\n", "hello,\r\n\r\nworld,\"\"", + "hello,\r\n\r\nworld,\n", "hello,\r\n\r\nworld,", "hello,\r\n\r\nworld,\"\"\n", "hello,\r\n\r\nworld,\"\"" }; final String[][] res = { { "hello", "" }, // CSV format ignores empty lines { "world", "" } }; for (final String code : codes) { - try (final CSVParser parser = CSVParser.parse(code, CSVFormat.DEFAULT)) { + try (CSVParser parser = CSVParser.parse(code, CSVFormat.DEFAULT)) { final List records = parser.getRecords(); assertEquals(res.length, records.size()); - assertTrue(records.size() > 0); + assertFalse(records.isEmpty()); for (int i = 0; i < res.length; i++) { - assertArrayEquals(res[i], records.get(i).values()); + assertValuesEquals(res[i], records.get(i)); } } } } @Test - public void testEndOfFileBehaviourExcel() throws Exception { - final String[] codes = { "hello,\r\n\r\nworld,\r\n", "hello,\r\n\r\nworld,", "hello,\r\n\r\nworld,\"\"\r\n", - "hello,\r\n\r\nworld,\"\"", "hello,\r\n\r\nworld,\n", "hello,\r\n\r\nworld,", - "hello,\r\n\r\nworld,\"\"\n", "hello,\r\n\r\nworld,\"\"" }; + void testEndOfFileBehaviorExcel() throws Exception { + final String[] codes = { "hello,\r\n\r\nworld,\r\n", "hello,\r\n\r\nworld,", "hello,\r\n\r\nworld,\"\"\r\n", "hello,\r\n\r\nworld,\"\"", + "hello,\r\n\r\nworld,\n", "hello,\r\n\r\nworld,", "hello,\r\n\r\nworld,\"\"\n", "hello,\r\n\r\nworld,\"\"" }; final String[][] res = { { "hello", "" }, { "" }, // Excel format does not ignore empty lines { "world", "" } }; for (final String code : codes) { - try (final CSVParser parser = CSVParser.parse(code, CSVFormat.EXCEL)) { + try (CSVParser parser = CSVParser.parse(code, CSVFormat.EXCEL)) { final List records = parser.getRecords(); assertEquals(res.length, records.size()); - assertTrue(records.size() > 0); + assertFalse(records.isEmpty()); for (int i = 0; i < res.length; i++) { - assertArrayEquals(res[i], records.get(i).values()); + assertValuesEquals(res[i], 
records.get(i)); } } } } @Test - public void testExcelFormat1() throws IOException { - final String code = "value1,value2,value3,value4\r\na,b,c,d\r\n x,,," + - "\r\n\r\n\"\"\"hello\"\"\",\" \"\"world\"\"\",\"abc\ndef\",\r\n"; - final String[][] res = { { "value1", "value2", "value3", "value4" }, { "a", "b", "c", "d" }, - { " x", "", "", "" }, { "" }, { "\"hello\"", " \"world\"", "abc\ndef", "" } }; - try (final CSVParser parser = CSVParser.parse(code, CSVFormat.EXCEL)) { + void testExcelFormat1() throws IOException { + final String code = "value1,value2,value3,value4\r\na,b,c,d\r\n x,,,\r\n\r\n\"\"\"hello\"\"\",\" \"\"world\"\"\",\"abc\ndef\",\r\n"; + final String[][] res = { { "value1", "value2", "value3", "value4" }, { "a", "b", "c", "d" }, { " x", "", "", "" }, { "" }, + { "\"hello\"", " \"world\"", "abc\ndef", "" } }; + try (CSVParser parser = CSVParser.parse(code, CSVFormat.EXCEL)) { final List records = parser.getRecords(); assertEquals(res.length, records.size()); - assertTrue(records.size() > 0); + assertFalse(records.isEmpty()); for (int i = 0; i < res.length; i++) { - assertArrayEquals(res[i], records.get(i).values()); + assertValuesEquals(res[i], records.get(i)); } } } @Test - public void testExcelFormat2() throws Exception { + void testExcelFormat2() throws Exception { final String code = "foo,baar\r\n\r\nhello,\r\n\r\nworld,\r\n"; final String[][] res = { { "foo", "baar" }, { "" }, { "hello", "" }, { "" }, { "world", "" } }; - try (final CSVParser parser = CSVParser.parse(code, CSVFormat.EXCEL)) { + try (CSVParser parser = CSVParser.parse(code, CSVFormat.EXCEL)) { final List records = parser.getRecords(); assertEquals(res.length, records.size()); - assertTrue(records.size() > 0); + assertFalse(records.isEmpty()); for (int i = 0; i < res.length; i++) { - assertArrayEquals(res[i], records.get(i).values()); + assertValuesEquals(res[i], records.get(i)); } } } @@ -403,41 +617,235 @@ public void testExcelFormat2() throws Exception { * Tests an exported 
Excel worksheet with a header row and rows that have more columns than the headers */ @Test - public void testExcelHeaderCountLessThanData() throws Exception { + void testExcelHeaderCountLessThanData() throws Exception { final String code = "A,B,C,,\r\na,b,c,d,e\r\n"; - try (final CSVParser parser = CSVParser.parse(code, CSVFormat.EXCEL.withHeader())) { - for (final CSVRecord record : parser.getRecords()) { - Assert.assertEquals("a", record.get("A")); - Assert.assertEquals("b", record.get("B")); - Assert.assertEquals("c", record.get("C")); - } + try (CSVParser parser = CSVParser.parse(code, EXCEL_WITH_HEADER)) { + parser.getRecords().forEach(record -> { + assertEquals("a", record.get("A")); + assertEquals("b", record.get("B")); + assertEquals("c", record.get("C")); + }); + } + } + + @Test + void testFirstEndOfLineCr() throws IOException { + final String data = "foo\rbaar,\rhello,world\r,kanu"; + try (CSVParser parser = CSVParser.parse(data, CSVFormat.DEFAULT)) { + final List records = parser.getRecords(); + assertEquals(4, records.size()); + assertEquals("\r", parser.getFirstEndOfLine()); + } + } + + @Test + void testFirstEndOfLineCrLf() throws IOException { + final String data = "foo\r\nbaar,\r\nhello,world\r\n,kanu"; + try (CSVParser parser = CSVParser.parse(data, CSVFormat.DEFAULT)) { + final List records = parser.getRecords(); + assertEquals(4, records.size()); + assertEquals("\r\n", parser.getFirstEndOfLine()); + } + } + + @Test + void testFirstEndOfLineLf() throws IOException { + final String data = "foo\nbaar,\nhello,world\n,kanu"; + try (CSVParser parser = CSVParser.parse(data, CSVFormat.DEFAULT)) { + final List records = parser.getRecords(); + assertEquals(4, records.size()); + assertEquals("\n", parser.getFirstEndOfLine()); } } @Test - public void testForEach() throws Exception { - final List records = new ArrayList<>(); - try (final Reader in = new StringReader("a,b,c\n1,2,3\nx,y,z")) { - for (final CSVRecord record : CSVFormat.DEFAULT.parse(in)) { + 
void testForEach() throws Exception { + try (Reader in = new StringReader("a,b,c\n1,2,3\nx,y,z"); + CSVParser parser = CSVFormat.DEFAULT.parse(in)) { + final List records = new ArrayList<>(); + for (final CSVRecord record : parser) { records.add(record); } assertEquals(3, records.size()); - assertArrayEquals(new String[] { "a", "b", "c" }, records.get(0).values()); - assertArrayEquals(new String[] { "1", "2", "3" }, records.get(1).values()); - assertArrayEquals(new String[] { "x", "y", "z" }, records.get(2).values()); + assertValuesEquals(new String[] { "a", "b", "c" }, records.get(0)); + assertValuesEquals(new String[] { "1", "2", "3" }, records.get(1)); + assertValuesEquals(new String[] { "x", "y", "z" }, records.get(2)); + } + } + + @Test + void testGetBytePositionMultiCharacterDelimiter() throws IOException { + final String code = "aa[|]bb\ncc[|]dd\n"; + final CSVFormat format = CSVFormat.DEFAULT.builder().setDelimiter("[|]").get(); + try (CSVParser parser = CSVParser.builder() + .setReader(new StringReader(code)) + .setFormat(format) + .setCharset(StandardCharsets.UTF_8) + .setTrackBytes(true) + .get()) { + final Iterator it = parser.iterator(); + final CSVRecord first = it.next(); + final CSVRecord second = it.next(); + assertEquals(0, first.getBytePosition()); + assertEquals(8, second.getBytePosition()); + } + } + + /** + * Tests CSV-329. 
+ */ + @Test + void testGetBytePositionMultiCharacterDelimiterWithSupplementaryCharacter() throws IOException { + final String delimiter = "x😀"; + final String code = "ax😀b\ncx😀d\n"; + final CSVFormat format = CSVFormat.DEFAULT.builder().setDelimiter(delimiter).get(); + try (CSVParser parser = CSVParser.builder() + .setReader(new StringReader(code)) + .setFormat(format) + .setCharset(UTF_8) + .setTrackBytes(true) + .get()) { + final CSVRecord first = parser.nextRecord(); + final CSVRecord second = parser.nextRecord(); + assertNotNull(first); + assertNotNull(second); + assertValuesEquals(new String[] { "a", "b" }, first); + assertValuesEquals(new String[] { "c", "d" }, second); + assertEquals(0, first.getBytePosition()); + assertEquals("ax😀b\n".getBytes(UTF_8).length, second.getBytePosition()); + } + } + + @Test + void testGetBytePositionWithCharacterOffsetAndMultiBytePrefix() throws Exception { + final String row0 = "é,x\n"; + final Charset charset = UTF_8; + // row0 char count is 4 + assertEquals(4, row0.length()); + // row0 byte count is 5 + final int record1ByteOffset = row0.getBytes(charset).length; + assertEquals(5, record1ByteOffset); + final String row1 = "b,c\n"; + final String rows = row0 + row1; + final long record1CharOffset = row0.length(); + final long expectedByteOffset = row0.getBytes(charset).length; + try (CSVParser parser = CSVParser.builder() + .setReader(new StringReader(row1)) + .setFormat(CSVFormat.DEFAULT) + .setCharset(charset) + .setTrackBytes(true) + .setByteOffset(record1ByteOffset) + .setCharacterOffset(record1CharOffset) + .setRecordNumber(2) // not relevant but a better use case example. 
+ .get()) { + final CSVRecord record = parser.nextRecord(); + assertNotNull(record); + assertEquals(4, record.getCharacterPosition()); + assertEquals(record1CharOffset, record.getCharacterPosition()); + assertEquals(expectedByteOffset, record.getBytePosition()); + } + } + + @Test + void testGetBytePositionWithSingleByteCharset() throws IOException { + // A single-byte charset cannot encode U+FFFF, the char value of the EOF sentinel. + // Byte counting must skip the EOF read so a valid file parses without throwing. + final String code = "a,b\nc,d\n"; + try (CSVParser parser = CSVParser.builder() + .setReader(new StringReader(code)) + .setFormat(CSVFormat.DEFAULT) + .setCharset(StandardCharsets.ISO_8859_1) + .setTrackBytes(true) + .get()) { + final CSVRecord first = parser.nextRecord(); + final CSVRecord second = parser.nextRecord(); + assertNotNull(first); + assertNotNull(second); + assertNull(parser.nextRecord()); + assertEquals(0, first.getBytePosition()); + assertEquals(4, second.getBytePosition()); + } + } + + @Test + void testGetHeaderComment_HeaderComment1() throws IOException { + try (CSVParser parser = CSVParser.parse(CSV_INPUT_HEADER_COMMENT, FORMAT_AUTO_HEADER)) { + parser.getRecords(); + // Expect a header comment + assertTrue(parser.hasHeaderComment()); + assertEquals("header comment", parser.getHeaderComment()); + } + } + + @Test + void testGetHeaderComment_HeaderComment2() throws IOException { + try (CSVParser parser = CSVParser.parse(CSV_INPUT_HEADER_COMMENT, FORMAT_EXPLICIT_HEADER)) { + parser.getRecords(); + // Expect a header comment + assertTrue(parser.hasHeaderComment()); + assertEquals("header comment", parser.getHeaderComment()); + } + } + + @Test + void testGetHeaderComment_HeaderComment3() throws IOException { + try (CSVParser parser = CSVParser.parse(CSV_INPUT_HEADER_COMMENT, FORMAT_EXPLICIT_HEADER_NOSKIP)) { + parser.getRecords(); + // Expect no header comment - the text "comment" is attached to the first record + 
assertFalse(parser.hasHeaderComment()); + assertNull(parser.getHeaderComment()); + } + } + + @Test + void testGetHeaderComment_HeaderTrailerComment() throws IOException { + try (CSVParser parser = CSVParser.parse(CSV_INPUT_MULTILINE_HEADER_TRAILER_COMMENT, FORMAT_AUTO_HEADER)) { + parser.getRecords(); + // Expect a header comment + assertTrue(parser.hasHeaderComment()); + assertEquals("multi-line" + LF + "header comment", parser.getHeaderComment()); + } + } + + @Test + void testGetHeaderComment_NoComment1() throws IOException { + try (CSVParser parser = CSVParser.parse(CSV_INPUT_NO_COMMENT, FORMAT_AUTO_HEADER)) { + parser.getRecords(); + // Expect no header comment + assertFalse(parser.hasHeaderComment()); + assertNull(parser.getHeaderComment()); + } + } + + @Test + void testGetHeaderComment_NoComment2() throws IOException { + try (CSVParser parser = CSVParser.parse(CSV_INPUT_NO_COMMENT, FORMAT_EXPLICIT_HEADER)) { + parser.getRecords(); + // Expect no header comment + assertFalse(parser.hasHeaderComment()); + assertNull(parser.getHeaderComment()); + } + } + + @Test + void testGetHeaderComment_NoComment3() throws IOException { + try (CSVParser parser = CSVParser.parse(CSV_INPUT_NO_COMMENT, FORMAT_EXPLICIT_HEADER_NOSKIP)) { + parser.getRecords(); + // Expect no header comment + assertFalse(parser.hasHeaderComment()); + assertNull(parser.getHeaderComment()); } } @Test - public void testGetHeaderMap() throws Exception { - try (final CSVParser parser = CSVParser.parse("a,b,c\n1,2,3\nx,y,z", - CSVFormat.DEFAULT.withHeader("A", "B", "C"))) { + void testGetHeaderMap() throws Exception { + try (CSVParser parser = CSVParser.parse("a,b,c\n1,2,3\nx,y,z", CSVFormat.DEFAULT.withHeader("A", "B", "C"))) { final Map headerMap = parser.getHeaderMap(); final Iterator columnNames = headerMap.keySet().iterator(); // Headers are iterated in column order. 
- Assert.assertEquals("A", columnNames.next()); - Assert.assertEquals("B", columnNames.next()); - Assert.assertEquals("C", columnNames.next()); + assertEquals("A", columnNames.next()); + assertEquals("B", columnNames.next()); + assertEquals("C", columnNames.next()); final Iterator records = parser.iterator(); // Parse to make sure getHeaderMap did not have a side-effect. @@ -454,10 +862,33 @@ public void testGetHeaderMap() throws Exception { } @Test - public void testGetLine() throws IOException { - try (final CSVParser parser = CSVParser.parse(CSV_INPUT, CSVFormat.DEFAULT.withIgnoreSurroundingSpaces())) { + void testGetHeaderNames() throws IOException { + try (CSVParser parser = CSVParser.parse("a,b,c\n1,2,3\nx,y,z", CSVFormat.DEFAULT.withHeader("A", "B", "C"))) { + final Map nameIndexMap = parser.getHeaderMap(); + final List headerNames = parser.getHeaderNames(); + assertNotNull(headerNames); + assertEquals(nameIndexMap.size(), headerNames.size()); + for (int i = 0; i < headerNames.size(); i++) { + final String name = headerNames.get(i); + assertEquals(i, nameIndexMap.get(name).intValue()); + } + } + } + + @Test + void testGetHeaderNamesReadOnly() throws IOException { + try (CSVParser parser = CSVParser.parse("a,b,c\n1,2,3\nx,y,z", CSVFormat.DEFAULT.withHeader("A", "B", "C"))) { + final List headerNames = parser.getHeaderNames(); + assertNotNull(headerNames); + assertThrows(UnsupportedOperationException.class, () -> headerNames.add("This is a read-only list.")); + } + } + + @Test + void testGetLine() throws IOException { + try (CSVParser parser = CSVParser.parse(CSV_INPUT, CSVFormat.DEFAULT.withIgnoreSurroundingSpaces())) { for (final String[] re : RESULT) { - assertArrayEquals(re, parser.nextRecord().values()); + assertValuesEquals(re, parser.nextRecord()); } assertNull(parser.nextRecord()); @@ -465,90 +896,202 @@ public void testGetLine() throws IOException { } @Test - public void testGetLineNumberWithCR() throws Exception { - 
this.validateLineNumbers(String.valueOf(CR)); + void testGetLineNumberWithCR() throws Exception { + validateLineNumbers(String.valueOf(CR)); } @Test - public void testGetLineNumberWithCRLF() throws Exception { - this.validateLineNumbers(CRLF); + void testGetLineNumberWithCRLF() throws Exception { + validateLineNumbers(CRLF); } @Test - public void testGetLineNumberWithLF() throws Exception { - this.validateLineNumbers(String.valueOf(LF)); + void testGetLineNumberWithLF() throws Exception { + validateLineNumbers(String.valueOf(LF)); } @Test - public void testGetOneLine() throws IOException { - try (final CSVParser parser = CSVParser.parse(CSV_INPUT_1, CSVFormat.DEFAULT)) { + void testGetOneLine() throws IOException { + try (CSVParser parser = CSVParser.parse(CSV_INPUT_1, CSVFormat.DEFAULT)) { final CSVRecord record = parser.getRecords().get(0); - assertArrayEquals(RESULT[0], record.values()); + assertValuesEquals(RESULT[0], record); } } /** * Tests reusing a parser to process new string records one at a time as they are being discovered. See [CSV-110]. * - * @throws IOException + * @throws IOException when an I/O error occurs. 
*/ @Test - public void testGetOneLineOneParser() throws IOException { + void testGetOneLineOneParser() throws IOException { final CSVFormat format = CSVFormat.DEFAULT; - try (final PipedWriter writer = new PipedWriter(); - final CSVParser parser = new CSVParser(new PipedReader(writer), format)) { + try (PipedWriter writer = new PipedWriter(); + PipedReader origin = new PipedReader(writer); + CSVParser parser = CSVParser.builder() + .setReader(origin) + .setFormat(format) + .get()) { writer.append(CSV_INPUT_1); writer.append(format.getRecordSeparator()); final CSVRecord record1 = parser.nextRecord(); - assertArrayEquals(RESULT[0], record1.values()); + assertValuesEquals(RESULT[0], record1); writer.append(CSV_INPUT_2); writer.append(format.getRecordSeparator()); final CSVRecord record2 = parser.nextRecord(); - assertArrayEquals(RESULT[1], record2.values()); + assertValuesEquals(RESULT[1], record2); + } + } + + @Test + void testGetRecordFourBytesRead() throws Exception { + final String code = "id,a,b,c\n" + + "1,😊,🤔,😂\n" + + "2,😊,🤔,😂\n" + + "3,😊,🤔,😂\n"; + final CSVFormat format = CSVFormat.Builder.create() + .setDelimiter(',') + .setQuote('\'') + .get(); + try (CSVParser parser = CSVParser.builder().setReader(new StringReader(code)).setFormat(format).setCharset(UTF_8).setTrackBytes(true).get()) { + CSVRecord record = new CSVRecord(parser, null, null, 1L, 0L, 0L); + + assertEquals(0, parser.getRecordNumber()); + assertNotNull(record = parser.nextRecord()); + assertEquals(1, record.getRecordNumber()); + assertEquals(code.indexOf('i'), record.getCharacterPosition()); + assertEquals(record.getBytePosition(), record.getCharacterPosition()); + + assertNotNull(record = parser.nextRecord()); + assertEquals(2, record.getRecordNumber()); + assertEquals(code.indexOf('1'), record.getCharacterPosition()); + assertEquals(record.getBytePosition(), record.getCharacterPosition()); + assertNotNull(record = parser.nextRecord()); + assertEquals(3, record.getRecordNumber()); + 
assertEquals(code.indexOf('2'), record.getCharacterPosition()); + assertEquals(record.getBytePosition(), 26); + assertNotNull(record = parser.nextRecord()); + assertEquals(4, record.getRecordNumber()); + assertEquals(code.indexOf('3'), record.getCharacterPosition()); + assertEquals(record.getBytePosition(), 43); } } @Test - public void testGetRecordNumberWithCR() throws Exception { - this.validateRecordNumbers(String.valueOf(CR)); + void testGetRecordNumberWithCR() throws Exception { + validateRecordNumbers(String.valueOf(CR)); } @Test - public void testGetRecordNumberWithCRLF() throws Exception { - this.validateRecordNumbers(CRLF); + void testGetRecordNumberWithCRLF() throws Exception { + validateRecordNumbers(CRLF); } @Test - public void testGetRecordNumberWithLF() throws Exception { - this.validateRecordNumbers(String.valueOf(LF)); + void testGetRecordNumberWithLF() throws Exception { + validateRecordNumbers(String.valueOf(LF)); } @Test - public void testGetRecordPositionWithCRLF() throws Exception { - this.validateRecordPosition(CRLF); + void testGetRecordPositionWithCRLF() throws Exception { + validateRecordPosition(CRLF); } @Test - public void testGetRecordPositionWithLF() throws Exception { - this.validateRecordPosition(String.valueOf(LF)); + void testGetRecordPositionWithLF() throws Exception { + validateRecordPosition(String.valueOf(LF)); } @Test - public void testGetRecords() throws IOException { - try (final CSVParser parser = CSVParser.parse(CSV_INPUT, CSVFormat.DEFAULT.withIgnoreSurroundingSpaces())) { + void testGetRecords() throws IOException { + try (CSVParser parser = CSVParser.parse(CSV_INPUT, CSVFormat.DEFAULT.withIgnoreSurroundingSpaces())) { final List records = parser.getRecords(); assertEquals(RESULT.length, records.size()); - assertTrue(records.size() > 0); + assertFalse(records.isEmpty()); for (int i = 0; i < RESULT.length; i++) { - assertArrayEquals(RESULT[i], records.get(i).values()); + assertValuesEquals(RESULT[i], records.get(i)); + } + 
} + } + + @Test + void testGetRecordsFromBrokenInputStream() throws IOException { + @SuppressWarnings("resource") // We also get an exception on close, which is OK but can't assert in a try. + final CSVParser parser = CSVParser.parse(new BrokenInputStream(), UTF_8, CSVFormat.DEFAULT); + assertThrows(UncheckedIOException.class, parser::getRecords); + + } + + @ParameterizedTest + @ValueSource(longs = { -1, 0, 1, 2, 3, 4, Long.MAX_VALUE }) + void testGetRecordsMaxRows(final long maxRows) throws IOException { + try (CSVParser parser = CSVParser.parse(CSV_INPUT, CSVFormat.DEFAULT.builder().setIgnoreSurroundingSpaces(true).setMaxRows(maxRows).get())) { + final List records = parser.getRecords(); + final long expectedLength = maxRows <= 0 || maxRows > RESULT.length ? RESULT.length : maxRows; + assertEquals(expectedLength, records.size()); + assertFalse(records.isEmpty()); + for (int i = 0; i < expectedLength; i++) { + assertValuesEquals(RESULT[i], records.get(i)); } } } + /** + * Tests CSV-327. 
+ */ + @Test + void testGetRecordsMaxRowsWithRecordNumberOffset() throws IOException { + try (CSVParser parser = CSVParser.builder() + .setReader(new StringReader("a,b\nc,d\n")) + .setFormat(CSVFormat.DEFAULT.builder().setMaxRows(1).get()) + .setRecordNumber(2) + .get()) { + final List records = parser.getRecords(); + assertEquals(1, records.size()); + assertEquals(2, records.get(0).getRecordNumber()); + assertValuesEquals(new String[] { "a", "b" }, records.get(0)); + } + } + + @Test + void testGetRecordThreeBytesRead() throws Exception { + final String code = "id,date,val5,val4\n" + + "11111111111111,'4017-09-01',きちんと節分近くには咲いてる～,v4\n" + + "22222222222222,'4017-01-01',おはよう私の友人～,v4\n" + + "33333333333333,'4017-01-01',きる自然の力ってすごいな～,v4\n"; + final CSVFormat format = CSVFormat.Builder.create() + .setDelimiter(',') + .setQuote('\'') + .get(); + try (CSVParser parser = CSVParser.builder().setReader(new StringReader(code)).setFormat(format).setCharset(UTF_8).setTrackBytes(true).get()) { + CSVRecord record = new CSVRecord(parser, null, null, 1L, 0L, 0L); + + assertEquals(0, parser.getRecordNumber()); + assertNotNull(record = parser.nextRecord()); + assertEquals(1, record.getRecordNumber()); + assertEquals(code.indexOf('i'), record.getCharacterPosition()); + assertEquals(record.getBytePosition(), record.getCharacterPosition()); + + assertNotNull(record = parser.nextRecord()); + assertEquals(2, record.getRecordNumber()); + assertEquals(code.indexOf('1'), record.getCharacterPosition()); + assertEquals(record.getBytePosition(), record.getCharacterPosition()); + + assertNotNull(record = parser.nextRecord()); + assertEquals(3, record.getRecordNumber()); + assertEquals(code.indexOf('2'), record.getCharacterPosition()); + assertEquals(record.getBytePosition(), 95); + + assertNotNull(record = parser.nextRecord()); + assertEquals(4, record.getRecordNumber()); + assertEquals(code.indexOf('3'), record.getCharacterPosition()); + assertEquals(record.getBytePosition(), 154); + } + } + 
@Test - public void testGetRecordWithMultiLineValues() throws Exception { - try (final CSVParser parser = CSVParser.parse( - "\"a\r\n1\",\"a\r\n2\"" + CRLF + "\"b\r\n1\",\"b\r\n2\"" + CRLF + "\"c\r\n1\",\"c\r\n2\"", + void testGetRecordWithMultiLineValues() throws Exception { + try (CSVParser parser = CSVParser.parse("\"a\r\n1\",\"a\r\n2\"" + CRLF + "\"b\r\n1\",\"b\r\n2\"" + CRLF + "\"c\r\n1\",\"c\r\n2\"", CSVFormat.DEFAULT.withRecordSeparator(CRLF))) { CSVRecord record; assertEquals(0, parser.getRecordNumber()); @@ -562,425 +1105,923 @@ public void testGetRecordWithMultiLineValues() throws Exception { assertEquals(2, record.getRecordNumber()); assertEquals(2, parser.getRecordNumber()); assertNotNull(record = parser.nextRecord()); - assertEquals(8, parser.getCurrentLineNumber()); + assertEquals(9, parser.getCurrentLineNumber()); assertEquals(3, record.getRecordNumber()); assertEquals(3, parser.getRecordNumber()); assertNull(record = parser.nextRecord()); - assertEquals(8, parser.getCurrentLineNumber()); + assertEquals(9, parser.getCurrentLineNumber()); assertEquals(3, parser.getRecordNumber()); } } @Test - public void testHeader() throws Exception { - final Reader in = new StringReader("a,b,c\n1,2,3\nx,y,z"); - - final Iterator records = CSVFormat.DEFAULT.withHeader().parse(in).iterator(); - - for (int i = 0; i < 2; i++) { - assertTrue(records.hasNext()); - final CSVRecord record = records.next(); - assertEquals(record.get(0), record.get("a")); - assertEquals(record.get(1), record.get("b")); - assertEquals(record.get(2), record.get("c")); + void testGetTrailerComment_HeaderComment1() throws IOException { + try (CSVParser parser = CSVParser.parse(CSV_INPUT_HEADER_COMMENT, FORMAT_AUTO_HEADER)) { + parser.getRecords(); + assertFalse(parser.hasTrailerComment()); + assertNull(parser.getTrailerComment()); } - - assertFalse(records.hasNext()); } @Test - public void testHeaderComment() throws Exception { - final Reader in = new StringReader("# 
comment\na,b,c\n1,2,3\nx,y,z"); - - final Iterator records = CSVFormat.DEFAULT.withCommentMarker('#').withHeader().parse(in).iterator(); - - for (int i = 0; i < 2; i++) { - assertTrue(records.hasNext()); - final CSVRecord record = records.next(); - assertEquals(record.get(0), record.get("a")); - assertEquals(record.get(1), record.get("b")); - assertEquals(record.get(2), record.get("c")); + void testGetTrailerComment_HeaderComment2() throws IOException { + try (CSVParser parser = CSVParser.parse(CSV_INPUT_HEADER_COMMENT, FORMAT_EXPLICIT_HEADER)) { + parser.getRecords(); + assertFalse(parser.hasTrailerComment()); + assertNull(parser.getTrailerComment()); } - - assertFalse(records.hasNext()); } @Test - public void testHeaderMissing() throws Exception { - final Reader in = new StringReader("a,,c\n1,2,3\nx,y,z"); - - final Iterator records = CSVFormat.DEFAULT.withHeader().parse(in).iterator(); - - for (int i = 0; i < 2; i++) { - assertTrue(records.hasNext()); - final CSVRecord record = records.next(); - assertEquals(record.get(0), record.get("a")); - assertEquals(record.get(2), record.get("c")); + void testGetTrailerComment_HeaderComment3() throws IOException { + try (CSVParser parser = CSVParser.parse(CSV_INPUT_HEADER_COMMENT, FORMAT_EXPLICIT_HEADER_NOSKIP)) { + parser.getRecords(); + assertFalse(parser.hasTrailerComment()); + assertNull(parser.getTrailerComment()); } - - assertFalse(records.hasNext()); - } - - @Test - public void testHeaderMissingWithNull() throws Exception { - final Reader in = new StringReader("a,,c,,d\n1,2,3,4\nx,y,z,zz"); - CSVFormat.DEFAULT.withHeader().withNullString("").withAllowMissingColumnNames().parse(in).iterator(); } @Test - public void testHeadersMissing() throws Exception { - final Reader in = new StringReader("a,,c,,d\n1,2,3,4\nx,y,z,zz"); - CSVFormat.DEFAULT.withHeader().withAllowMissingColumnNames().parse(in).iterator(); - } - - @Test(expected = IllegalArgumentException.class) - public void testHeadersMissingException() throws 
Exception { - final Reader in = new StringReader("a,,c,,d\n1,2,3,4\nx,y,z,zz"); - CSVFormat.DEFAULT.withHeader().parse(in).iterator(); + void testGetTrailerComment_HeaderTrailerComment1() throws IOException { + try (CSVParser parser = CSVParser.parse(CSV_INPUT_HEADER_TRAILER_COMMENT, FORMAT_AUTO_HEADER)) { + parser.getRecords(); + assertTrue(parser.hasTrailerComment()); + assertEquals("comment", parser.getTrailerComment()); + } } @Test - public void testIgnoreCaseHeaderMapping() throws Exception { - final Reader in = new StringReader("1,2,3"); - final Iterator records = CSVFormat.DEFAULT.withHeader("One", "TWO", "three").withIgnoreHeaderCase() - .parse(in).iterator(); - final CSVRecord record = records.next(); - assertEquals("1", record.get("one")); - assertEquals("2", record.get("two")); - assertEquals("3", record.get("THREE")); + void testGetTrailerComment_HeaderTrailerComment2() throws IOException { + try (CSVParser parser = CSVParser.parse(CSV_INPUT_HEADER_TRAILER_COMMENT, FORMAT_EXPLICIT_HEADER)) { + parser.getRecords(); + assertTrue(parser.hasTrailerComment()); + assertEquals("comment", parser.getTrailerComment()); + } } @Test - public void testIgnoreEmptyLines() throws IOException { - final String code = "\nfoo,baar\n\r\n,\n\n,world\r\n\n"; - // String code = "world\r\n\n"; - // String code = "foo;baar\r\n\r\nhello;\r\n\r\nworld;\r\n"; - try (final CSVParser parser = CSVParser.parse(code, CSVFormat.DEFAULT)) { - final List records = parser.getRecords(); - assertEquals(3, records.size()); + void testGetTrailerComment_HeaderTrailerComment3() throws IOException { + try (CSVParser parser = CSVParser.parse(CSV_INPUT_HEADER_TRAILER_COMMENT, FORMAT_EXPLICIT_HEADER_NOSKIP)) { + parser.getRecords(); + assertTrue(parser.hasTrailerComment()); + assertEquals("comment", parser.getTrailerComment()); } } - @Test(expected = IllegalArgumentException.class) - public void testInvalidFormat() throws Exception { - final CSVFormat invalidFormat = 
CSVFormat.DEFAULT.withDelimiter(CR); - try (final CSVParser parser = new CSVParser(null, invalidFormat)) { - Assert.fail("This test should have thrown an exception."); + @Test + void testGetTrailerComment_MultilineComment() throws IOException { + try (CSVParser parser = CSVParser.parse(CSV_INPUT_MULTILINE_HEADER_TRAILER_COMMENT, FORMAT_AUTO_HEADER)) { + parser.getRecords(); + assertTrue(parser.hasTrailerComment()); + assertEquals("multi-line" + LF + "comment", parser.getTrailerComment()); } } @Test - public void testIterator() throws Exception { + void testHeader() throws Exception { final Reader in = new StringReader("a,b,c\n1,2,3\nx,y,z"); - final Iterator iterator = CSVFormat.DEFAULT.parse(in).iterator(); - - assertTrue(iterator.hasNext()); - try { - iterator.remove(); - fail("expected UnsupportedOperationException"); - } catch (final UnsupportedOperationException expected) { - // expected - } - assertArrayEquals(new String[] { "a", "b", "c" }, iterator.next().values()); - assertArrayEquals(new String[] { "1", "2", "3" }, iterator.next().values()); - assertTrue(iterator.hasNext()); - assertTrue(iterator.hasNext()); - assertTrue(iterator.hasNext()); - assertArrayEquals(new String[] { "x", "y", "z" }, iterator.next().values()); - assertFalse(iterator.hasNext()); + try (CSVParser parser = CSVFormat.DEFAULT.withHeader().parse(in)) { + final Iterator records = parser.iterator(); - try { - iterator.next(); - fail("NoSuchElementException expected"); - } catch (final NoSuchElementException e) { - // expected - } + for (int i = 0; i < 2; i++) { + assertTrue(records.hasNext()); + final CSVRecord record = records.next(); + assertEquals(record.get(0), record.get("a")); + assertEquals(record.get(1), record.get("b")); + assertEquals(record.get(2), record.get("c")); + } + + assertFalse(records.hasNext()); + } + } + + @Test + void testHeaderComment() throws Exception { + final Reader in = new StringReader("# comment\na,b,c\n1,2,3\nx,y,z"); + try (CSVParser parser = 
CSVFormat.DEFAULT.withCommentMarker('#').withHeader().parse(in)) { + final Iterator records = parser.iterator(); + for (int i = 0; i < 2; i++) { + assertTrue(records.hasNext()); + final CSVRecord record = records.next(); + assertEquals(record.get(0), record.get("a")); + assertEquals(record.get(1), record.get("b")); + assertEquals(record.get(2), record.get("c")); + } + assertFalse(records.hasNext()); + } + } + + @Test + void testHeaderMissing() throws Exception { + final Reader in = new StringReader("a,,c\n1,2,3\nx,y,z"); + try (CSVParser parser = CSVFormat.DEFAULT.withHeader().withAllowMissingColumnNames().parse(in)) { + final Iterator records = parser.iterator(); + for (int i = 0; i < 2; i++) { + assertTrue(records.hasNext()); + final CSVRecord record = records.next(); + assertEquals(record.get(0), record.get("a")); + assertEquals(record.get(2), record.get("c")); + } + assertFalse(records.hasNext()); + } + } + + @Test + void testHeaderMissingWithNull() throws Exception { + final Reader in = new StringReader("a,,c,,e\n1,2,3,4,5\nv,w,x,y,z"); + try (CSVParser parser = CSVFormat.DEFAULT.withHeader().withNullString("").withAllowMissingColumnNames().parse(in)) { + parser.iterator(); + } + } + + @Test + void testHeadersMissing() throws Exception { + try (Reader in = new StringReader("a,,c,,e\n1,2,3,4,5\nv,w,x,y,z"); + CSVParser parser = CSVFormat.DEFAULT.withHeader().withAllowMissingColumnNames().parse(in)) { + parser.iterator(); + } + } + + @Test + void testHeadersMissingException() { + final Reader in = new StringReader("a,,c,,e\n1,2,3,4,5\nv,w,x,y,z"); + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.withHeader().parse(in).iterator()); + } + + @Test + void testHeadersMissingOneColumnException() { + final Reader in = new StringReader("a,,c,d,e\n1,2,3,4,5\nv,w,x,y,z"); + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.withHeader().parse(in).iterator()); + } + + @Test + void testHeadersWithNullColumnName() throws IOException { 
+ final Reader in = new StringReader("header1,null,header3\n1,2,3\n4,5,6"); + try (CSVParser parser = CSVFormat.DEFAULT.withHeader().withNullString("null").withAllowMissingColumnNames().parse(in)) { + final Iterator records = parser.iterator(); + final CSVRecord record = records.next(); + // Expect the null header to be missing + @SuppressWarnings("resource") + final CSVParser recordParser = record.getParser(); + assertEquals(Arrays.asList("header1", "header3"), recordParser.getHeaderNames()); + assertEquals(2, recordParser.getHeaderMap().size()); + } + } + + @Test + void testIgnoreCaseHeaderMapping() throws Exception { + final Reader reader = new StringReader("1,2,3"); + try (CSVParser parser = CSVFormat.DEFAULT.withHeader("One", "TWO", "three").withIgnoreHeaderCase().parse(reader)) { + final Iterator records = parser.iterator(); + final CSVRecord record = records.next(); + assertEquals("1", record.get("one")); + assertEquals("2", record.get("two")); + assertEquals("3", record.get("THREE")); + } + } + + @Test + void testIgnoreEmptyLines() throws IOException { + final String code = "\nfoo,baar\n\r\n,\n\n,world\r\n\n"; + // String code = "world\r\n\n"; + // String code = "foo;baar\r\n\r\nhello;\r\n\r\nworld;\r\n"; + try (CSVParser parser = CSVParser.parse(code, CSVFormat.DEFAULT)) { + final List records = parser.getRecords(); + assertEquals(3, records.size()); + } + } + + @Test + void testInvalidFormat() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.withDelimiter(CR)); + } + + @Test + void testIterator() throws Exception { + final Reader in = new StringReader("a,b,c\n1,2,3\nx,y,z"); + try (CSVParser parser = CSVFormat.DEFAULT.parse(in)) { + final Iterator iterator = parser.iterator(); + assertTrue(iterator.hasNext()); + assertThrows(UnsupportedOperationException.class, iterator::remove); + assertValuesEquals(new String[] { "a", "b", "c" }, iterator.next()); + assertValuesEquals(new String[] { "1", "2", "3" }, iterator.next()); + 
assertTrue(iterator.hasNext()); + assertTrue(iterator.hasNext()); + assertTrue(iterator.hasNext()); + assertValuesEquals(new String[] { "x", "y", "z" }, iterator.next()); + assertFalse(iterator.hasNext()); + assertThrows(NoSuchElementException.class, iterator::next); + } + } + + @ParameterizedTest + @ValueSource(longs = { -1, 0, 1, 2, 3, 4, 5, Long.MAX_VALUE }) + void testIteratorMaxRows(final long maxRows) throws Exception { + final Reader in = new StringReader("a,b,c\n1,2,3\nx,y,z"); + try (CSVParser parser = CSVFormat.DEFAULT.builder().setMaxRows(maxRows).get().parse(in)) { + final Iterator iterator = parser.iterator(); + assertTrue(iterator.hasNext()); + assertThrows(UnsupportedOperationException.class, iterator::remove); + assertValuesEquals(new String[] { "a", "b", "c" }, iterator.next()); + final boolean noLimit = maxRows <= 0; + final int fixtureLen = 3; + final long expectedLen = noLimit ? fixtureLen : Math.min(fixtureLen, maxRows); + if (expectedLen > 1) { + assertTrue(iterator.hasNext()); + assertValuesEquals(new String[] { "1", "2", "3" }, iterator.next()); + } + assertEquals(expectedLen > 2, iterator.hasNext()); + // again + assertEquals(expectedLen > 2, iterator.hasNext()); + if (expectedLen == fixtureLen) { + assertTrue(iterator.hasNext()); + assertValuesEquals(new String[] { "x", "y", "z" }, iterator.next()); + } + assertFalse(iterator.hasNext()); + assertThrows(NoSuchElementException.class, iterator::next); + } + } + + @Test + void testIteratorSequenceBreaking() throws IOException { + final String fiveRows = "1\n2\n3\n4\n5\n"; + // Iterator hasNext() shouldn't break sequence + try (CSVParser parser = CSVFormat.DEFAULT.parse(new StringReader(fiveRows))) { + final Iterator iter = parser.iterator(); + int recordNumber = 0; + while (iter.hasNext()) { + final CSVRecord record = iter.next(); + recordNumber++; + assertEquals(String.valueOf(recordNumber), record.get(0)); + if (recordNumber >= 2) { + break; + } + } + iter.hasNext(); + while (iter.hasNext()) 
{ + final CSVRecord record = iter.next(); + recordNumber++; + assertEquals(String.valueOf(recordNumber), record.get(0)); + } + } + // Consecutive enhanced for loops shouldn't break sequence + try (CSVParser parser = CSVFormat.DEFAULT.parse(new StringReader(fiveRows))) { + int recordNumber = 0; + for (final CSVRecord record : parser) { + recordNumber++; + assertEquals(String.valueOf(recordNumber), record.get(0)); + if (recordNumber >= 2) { + break; + } + } + for (final CSVRecord record : parser) { + recordNumber++; + assertEquals(String.valueOf(recordNumber), record.get(0)); + } + } + // Consecutive enhanced for loops with hasNext() peeking shouldn't break sequence + try (CSVParser parser = CSVFormat.DEFAULT.parse(new StringReader(fiveRows))) { + int recordNumber = 0; + for (final CSVRecord record : parser) { + recordNumber++; + assertEquals(String.valueOf(recordNumber), record.get(0)); + if (recordNumber >= 2) { + break; + } + } + parser.iterator().hasNext(); + for (final CSVRecord record : parser) { + recordNumber++; + assertEquals(String.valueOf(recordNumber), record.get(0)); + } + } } @Test - public void testLineFeedEndings() throws IOException { + void testLineFeedEndings() throws IOException { final String code = "foo\nbaar,\nhello,world\n,kanu"; - try (final CSVParser parser = CSVParser.parse(code, CSVFormat.DEFAULT)) { + try (CSVParser parser = CSVParser.parse(code, CSVFormat.DEFAULT)) { final List records = parser.getRecords(); assertEquals(4, records.size()); } } @Test - public void testMappedButNotSetAsOutlook2007ContactExport() throws Exception { + void testMappedButNotSetAsOutlook2007ContactExport() throws Exception { final Reader in = new StringReader("a,b,c\n1,2\nx,y,z"); - final Iterator records = CSVFormat.DEFAULT.withHeader("A", "B", "C").withSkipHeaderRecord().parse(in) - .iterator(); - CSVRecord record; - - // 1st record - record = records.next(); - assertTrue(record.isMapped("A")); - assertTrue(record.isMapped("B")); - 
assertTrue(record.isMapped("C")); - assertTrue(record.isSet("A")); - assertTrue(record.isSet("B")); - assertFalse(record.isSet("C")); - assertEquals("1", record.get("A")); - assertEquals("2", record.get("B")); - assertFalse(record.isConsistent()); - - // 2nd record - record = records.next(); - assertTrue(record.isMapped("A")); - assertTrue(record.isMapped("B")); - assertTrue(record.isMapped("C")); - assertTrue(record.isSet("A")); - assertTrue(record.isSet("B")); - assertTrue(record.isSet("C")); - assertEquals("x", record.get("A")); - assertEquals("y", record.get("B")); - assertEquals("z", record.get("C")); - assertTrue(record.isConsistent()); + try (CSVParser parser = CSVFormat.DEFAULT.withHeader("A", "B", "C").withSkipHeaderRecord().parse(in)) { + final Iterator records = parser.iterator(); + CSVRecord record; + // 1st record + record = records.next(); + assertTrue(record.isMapped("A")); + assertTrue(record.isMapped("B")); + assertTrue(record.isMapped("C")); + assertTrue(record.isSet("A")); + assertTrue(record.isSet("B")); + assertFalse(record.isSet("C")); + assertEquals("1", record.get("A")); + assertEquals("2", record.get("B")); + assertFalse(record.isConsistent()); + // 2nd record + record = records.next(); + assertTrue(record.isMapped("A")); + assertTrue(record.isMapped("B")); + assertTrue(record.isMapped("C")); + assertTrue(record.isSet("A")); + assertTrue(record.isSet("B")); + assertTrue(record.isSet("C")); + assertEquals("x", record.get("A")); + assertEquals("y", record.get("B")); + assertEquals("z", record.get("C")); + assertTrue(record.isConsistent()); + // end + assertFalse(records.hasNext()); + } + } - assertFalse(records.hasNext()); + @Test + @Disabled + void testMongoDbCsv() throws Exception { + try (CSVParser parser = CSVParser.parse("\"a a\",b,c" + LF + "d,e,f", CSVFormat.MONGODB_CSV)) { + final Iterator itr1 = parser.iterator(); + final Iterator itr2 = parser.iterator(); + + final CSVRecord first = itr1.next(); + assertEquals("a a", first.get(0)); 
+ assertEquals("b", first.get(1)); + assertEquals("c", first.get(2)); + + final CSVRecord second = itr2.next(); + assertEquals("d", second.get(0)); + assertEquals("e", second.get(1)); + assertEquals("f", second.get(2)); + } } @Test // TODO this may lead to strange behavior, throw an exception if iterator() has already been called? - public void testMultipleIterators() throws Exception { - try (final CSVParser parser = CSVParser.parse("a,b,c" + CR + "d,e,f", CSVFormat.DEFAULT)) { + void testMultipleIterators() throws Exception { + try (CSVParser parser = CSVParser.parse("a,b,c" + CRLF + "d,e,f", CSVFormat.DEFAULT)) { final Iterator itr1 = parser.iterator(); - final Iterator itr2 = parser.iterator(); final CSVRecord first = itr1.next(); assertEquals("a", first.get(0)); assertEquals("b", first.get(1)); assertEquals("c", first.get(2)); - final CSVRecord second = itr2.next(); + final CSVRecord second = itr1.next(); assertEquals("d", second.get(0)); assertEquals("e", second.get(1)); assertEquals("f", second.get(2)); } } - @Test(expected = IllegalArgumentException.class) - public void testNewCSVParserNullReaderFormat() throws Exception { - try (final CSVParser parser = new CSVParser(null, CSVFormat.DEFAULT)) { - Assert.fail("This test should have thrown an exception."); + @Test + void testNewCSVParserNullReaderFormat() { + assertThrows(NullPointerException.class, () -> new CSVParser(null, CSVFormat.DEFAULT)); + } + + @Test + void testNewCSVParserReaderNullFormat() { + assertThrows(NullPointerException.class, () -> new CSVParser(new StringReader(""), null)); + } + + @Test + void testNoHeaderMap() throws Exception { + try (CSVParser parser = CSVParser.parse("a,b,c\n1,2,3\nx,y,z", CSVFormat.DEFAULT)) { + assertNull(parser.getHeaderMap()); } } - @Test(expected = IllegalArgumentException.class) - public void testNewCSVParserReaderNullFormat() throws Exception { - try (final CSVParser parser = new CSVParser(new StringReader(""), null)) { - Assert.fail("This test should have 
thrown an exception."); + @Test + void testNotValueCSV() throws IOException { + final String source = "#"; + final CSVFormat csvFormat = CSVFormat.DEFAULT.withCommentMarker('#'); + try (CSVParser csvParser = csvFormat.parse(new StringReader(source))) { + final CSVRecord csvRecord = csvParser.nextRecord(); + assertNull(csvRecord); } } @Test - public void testNoHeaderMap() throws Exception { - try (final CSVParser parser = CSVParser.parse("a,b,c\n1,2,3\nx,y,z", CSVFormat.DEFAULT)) { - Assert.assertNull(parser.getHeaderMap()); + void testParse() throws Exception { + final URL url = ClassLoader.getSystemClassLoader().getResource("org/apache/commons/csv/CSVFileParser/test.csv"); + final CSVFormat format = CSVFormat.DEFAULT.builder().setHeader("A", "B", "C", "D").get(); + final Charset charset = StandardCharsets.UTF_8; + // Reader + try (CSVParser parser = CSVParser.parse(new InputStreamReader(url.openStream(), charset), format)) { + parseFully(parser); + } + try (CSVParser parser = CSVParser.builder().setReader(new InputStreamReader(url.openStream(), charset)).setFormat(format).get()) { + parseFully(parser); + } + // String + final Path path = Paths.get(url.toURI()); + final String string = new String(Files.readAllBytes(path), charset); + try (CSVParser parser = CSVParser.parse(string, format)) { + parseFully(parser); + } + try (CSVParser parser = CSVParser.builder().setCharSequence(string).setFormat(format).get()) { + parseFully(parser); + } + // File + final File file = new File(url.toURI()); + try (CSVParser parser = CSVParser.parse(file, charset, format)) { + parseFully(parser); + } + try (CSVParser parser = CSVParser.builder().setFile(file).setCharset(charset).setFormat(format).get()) { + parseFully(parser); + } + // InputStream + try (CSVParser parser = CSVParser.parse(url.openStream(), charset, format)) { + parseFully(parser); + } + try (CSVParser parser = CSVParser.builder().setInputStream(url.openStream()).setCharset(charset).setFormat(format).get()) { + 
parseFully(parser); + } + // Path + try (CSVParser parser = CSVParser.parse(path, charset, format)) { + parseFully(parser); + } + try (CSVParser parser = CSVParser.builder().setPath(path).setCharset(charset).setFormat(format).get()) { + parseFully(parser); + } + // URL + try (CSVParser parser = CSVParser.parse(url, charset, format)) { + parseFully(parser); + } + try (CSVParser parser = CSVParser.builder().setURI(url.toURI()).setCharset(charset).setFormat(format).get()) { + parseFully(parser); + } + // InputStreamReader + try (CSVParser parser = new CSVParser(new InputStreamReader(url.openStream(), charset), format)) { + parseFully(parser); + } + try (CSVParser parser = CSVParser.builder().setReader(new InputStreamReader(url.openStream(), charset)).setFormat(format).get()) { + parseFully(parser); + } + // InputStreamReader with longs + try (CSVParser parser = new CSVParser(new InputStreamReader(url.openStream(), charset), format, /* characterOffset= */0, /* recordNumber= */1)) { + parseFully(parser); + } + try (CSVParser parser = CSVParser.builder().setReader(new InputStreamReader(url.openStream(), charset)).setFormat(format).setCharacterOffset(0) + .setRecordNumber(0).get()) { + parseFully(parser); } } - @Test(expected = IllegalArgumentException.class) - public void testParseFileNullFormat() throws Exception { - CSVParser.parse(new File(""), Charset.defaultCharset(), null); + @Test + void testParseFileCharsetNullFormat() throws IOException { + final File file = new File("src/test/resources/org/apache/commons/csv/CSVFileParser/test.csv"); + try (CSVParser parser = CSVParser.parse(file, Charset.defaultCharset(), null)) { + // null maps to DEFAULT. 
+ parseFully(parser); + } } - @Test(expected = IllegalArgumentException.class) - public void testParseNullFileFormat() throws Exception { - CSVParser.parse((File) null, Charset.defaultCharset(), CSVFormat.DEFAULT); + @Test + void testParseInputStreamCharsetNullFormat() throws IOException { + try (InputStream in = Files.newInputStream(Paths.get("src/test/resources/org/apache/commons/csv/CSVFileParser/test.csv")); + CSVParser parser = CSVParser.parse(in, Charset.defaultCharset(), null)) { + // null maps to DEFAULT. + parseFully(parser); + } } - @Test(expected = IllegalArgumentException.class) - public void testParseNullStringFormat() throws Exception { - CSVParser.parse((String) null, CSVFormat.DEFAULT); + @Test + void testParseNullFileFormat() { + assertThrows(NullPointerException.class, () -> CSVParser.parse((File) null, Charset.defaultCharset(), CSVFormat.DEFAULT)); + } + + @Test + void testParseNullPathFormat() { + assertThrows(NullPointerException.class, () -> CSVParser.parse((Path) null, Charset.defaultCharset(), CSVFormat.DEFAULT)); } - @Test(expected = IllegalArgumentException.class) - public void testParseNullUrlCharsetFormat() throws Exception { - CSVParser.parse((File) null, Charset.defaultCharset(), CSVFormat.DEFAULT); + @Test + void testParseNullStringFormat() { + assertThrows(NullPointerException.class, () -> CSVParser.parse((String) null, CSVFormat.DEFAULT)); + } + + @Test + void testParseNullUrlCharsetFormat() { + assertThrows(NullPointerException.class, () -> CSVParser.parse((URL) null, Charset.defaultCharset(), CSVFormat.DEFAULT)); } - @Test(expected = IllegalArgumentException.class) - public void testParserUrlNullCharsetFormat() throws Exception { - try (final CSVParser parser = CSVParser.parse(new URL("http://commons.apache.org"), null, CSVFormat.DEFAULT)) { - Assert.fail("This test should have thrown an exception."); + @Test + void testParsePathCharsetNullFormat() throws IOException { + final Path path = 
Paths.get("src/test/resources/org/apache/commons/csv/CSVFileParser/test.csv"); + try (CSVParser parser = CSVParser.parse(path, Charset.defaultCharset(), null)) { + // null maps to DEFAULT. + parseFully(parser); } } - @Test(expected = IllegalArgumentException.class) - public void testParseStringNullFormat() throws Exception { - CSVParser.parse("csv data", null); + @Test + void testParserUrlNullCharsetFormat() throws IOException { + final URL url = ClassLoader.getSystemClassLoader().getResource("org/apache/commons/csv/CSVFileParser/test.csv"); + try (CSVParser parser = CSVParser.parse(url, null, CSVFormat.DEFAULT)) { + // null maps to DEFAULT. + parseFully(parser); + } } - @Test(expected = IllegalArgumentException.class) - public void testParseUrlCharsetNullFormat() throws Exception { - try (final CSVParser parser = CSVParser.parse(new URL("http://commons.apache.org"), Charset.defaultCharset(), null)) { - Assert.fail("This test should have thrown an exception."); + @Test + void testParseStringNullFormat() throws IOException { + try (CSVParser parser = CSVParser.parse("1,2,3", null)) { + // null maps to DEFAULT. + final List records = parser.getRecords(); + assertEquals(1, records.size()); + final CSVRecord record = records.get(0); + assertEquals(3, record.size()); + assertEquals("1", record.get(0)); + assertEquals("2", record.get(1)); + assertEquals("3", record.get(2)); } } @Test - public void testProvidedHeader() throws Exception { - final Reader in = new StringReader("a,b,c\n1,2,3\nx,y,z"); + void testParseUrlCharsetNullFormat() throws IOException { + final URL url = ClassLoader.getSystemClassLoader().getResource("org/apache/commons/csv/CSVFileParser/test.csv"); + try (CSVParser parser = CSVParser.parse(url, Charset.defaultCharset(), null)) { + // null maps to DEFAULT. 
+ parseFully(parser); + } + } - final Iterator records = CSVFormat.DEFAULT.withHeader("A", "B", "C").parse(in).iterator(); + @Test + void testParseWithDelimiterStringWithEscape() throws IOException { + final String source = "a![!|!]b![|]c[|]xyz\r\nabc[abc][|]xyz"; + final CSVFormat csvFormat = CSVFormat.DEFAULT.builder().setDelimiter("[|]").setEscape('!').get(); + try (CSVParser csvParser = csvFormat.parse(new StringReader(source))) { + CSVRecord csvRecord = csvParser.nextRecord(); + assertEquals("a[|]b![|]c", csvRecord.get(0)); + assertEquals("xyz", csvRecord.get(1)); + csvRecord = csvParser.nextRecord(); + assertEquals("abc[abc]", csvRecord.get(0)); + assertEquals("xyz", csvRecord.get(1)); + } + } - for (int i = 0; i < 3; i++) { - assertTrue(records.hasNext()); - final CSVRecord record = records.next(); - assertTrue(record.isMapped("A")); - assertTrue(record.isMapped("B")); - assertTrue(record.isMapped("C")); - assertFalse(record.isMapped("NOT MAPPED")); - assertEquals(record.get(0), record.get("A")); - assertEquals(record.get(1), record.get("B")); - assertEquals(record.get(2), record.get("C")); + @Test + void testParseWithDelimiterStringWithQuote() throws IOException { + final String source = "'a[|]b[|]c'[|]xyz\r\nabc[abc][|]xyz"; + final CSVFormat csvFormat = CSVFormat.DEFAULT.builder().setDelimiter("[|]").setQuote('\'').get(); + try (CSVParser csvParser = csvFormat.parse(new StringReader(source))) { + CSVRecord csvRecord = csvParser.nextRecord(); + assertEquals("a[|]b[|]c", csvRecord.get(0)); + assertEquals("xyz", csvRecord.get(1)); + csvRecord = csvParser.nextRecord(); + assertEquals("abc[abc]", csvRecord.get(0)); + assertEquals("xyz", csvRecord.get(1)); } + } - assertFalse(records.hasNext()); + @Test + void testParseWithDelimiterWithEscape() throws IOException { + final String source = "a!,b!,c,xyz"; + final CSVFormat csvFormat = CSVFormat.DEFAULT.withEscape('!'); + try (CSVParser csvParser = csvFormat.parse(new StringReader(source))) { + final CSVRecord 
csvRecord = csvParser.nextRecord(); + assertEquals("a,b,c", csvRecord.get(0)); + assertEquals("xyz", csvRecord.get(1)); + } + } + + @Test + void testParseWithDelimiterWithQuote() throws IOException { + final String source = "'a,b,c',xyz"; + final CSVFormat csvFormat = CSVFormat.DEFAULT.withQuote('\''); + try (CSVParser csvParser = csvFormat.parse(new StringReader(source))) { + final CSVRecord csvRecord = csvParser.nextRecord(); + assertEquals("a,b,c", csvRecord.get(0)); + assertEquals("xyz", csvRecord.get(1)); + } + } + + @Test + void testParseWithQuoteThrowsException() { + final CSVFormat csvFormat = CSVFormat.DEFAULT.withQuote('\''); + assertThrows(IOException.class, () -> csvFormat.parse(new StringReader("'a,b,c','")).nextRecord()); + assertThrows(IOException.class, () -> csvFormat.parse(new StringReader("'a,b,c'abc,xyz")).nextRecord()); + assertThrows(IOException.class, () -> csvFormat.parse(new StringReader("'abc'a,b,c',xyz")).nextRecord()); + } + + @Test + void testParseWithQuoteWithEscape() throws IOException { + final String source = "'a?,b?,c?d',xyz"; + final CSVFormat csvFormat = CSVFormat.DEFAULT.withQuote('\'').withEscape('?'); + try (CSVParser csvParser = csvFormat.parse(new StringReader(source))) { + final CSVRecord csvRecord = csvParser.nextRecord(); + assertEquals("a,b,c?d", csvRecord.get(0)); + assertEquals("xyz", csvRecord.get(1)); + } + } + + @ParameterizedTest + @EnumSource(CSVFormat.Predefined.class) + void testParsingPrintedEmptyFirstColumn(final CSVFormat.Predefined format) throws Exception { + final String[][] lines = { { "a", "b" }, { "", "x" } }; + final StringWriter buf = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(buf, format.getFormat())) { + printer.printRecords(Stream.of(lines)); + } + try (CSVParser csvRecords = CSVParser.builder() + .setReader(new StringReader(buf.toString())) + .setFormat(format.getFormat()) + .get()) { + for (final String[] line : lines) { + assertValuesEquals(line, csvRecords.nextRecord()); + } 
+ assertNull(csvRecords.nextRecord()); + } + } + + /** + * A truncated escaped multi-character delimiter at EOF must stay literal data and not be completed from a stale + * escape delimiter look-ahead. + */ + @Test + void testPartialEscapedMultiCharacterDelimiterAtEOF() throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.builder().setDelimiter("[|]").setEscape('!').get(); + try (CSVParser parser = format.parse(new StringReader("x![!|!]y![!|"))) { + final CSVRecord record = parser.nextRecord(); + assertEquals("x[|]y![!|", record.get(0)); + assertEquals(1, record.size()); + } + } + + /** + * Tests CSV-324. + */ + @Test + void testPartialMultiCharacterDelimiterAtEOF() throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.builder().setDelimiter("[|]").get(); + try (CSVParser parser = format.parse(new StringReader("a[|]b[|"))) { + final CSVRecord record = parser.nextRecord(); + assertEquals("a", record.get(0)); + assertEquals("b[|", record.get(1)); + assertEquals(2, record.size()); + } + } + + /** + * A truncated multi-character delimiter at EOF must not be completed from the look-ahead buffer left dirty by an + * earlier non-matching peek in the same token. + */ + @Test + void testPartialMultiCharacterDelimiterAtEOFAfterMismatch() throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.builder().setDelimiter("[|]").get(); + // The "[a]" peek leaves ']' in the look-ahead buffer; the trailing "[|" must not match "[|]". 
+ final String recordString = "x[a][|"; + try (CSVParser parser = format.parse(new StringReader(recordString))) { + final CSVRecord record = parser.nextRecord(); + assertEquals(recordString, record.get(0)); + assertEquals(1, record.size()); + } } @Test - public void testProvidedHeaderAuto() throws Exception { + void testProvidedHeader() throws Exception { final Reader in = new StringReader("a,b,c\n1,2,3\nx,y,z"); + try (CSVParser parser = CSVFormat.DEFAULT.withHeader("A", "B", "C").parse(in)) { + final Iterator records = parser.iterator(); + for (int i = 0; i < 3; i++) { + assertTrue(records.hasNext()); + final CSVRecord record = records.next(); + assertTrue(record.isMapped("A")); + assertTrue(record.isMapped("B")); + assertTrue(record.isMapped("C")); + assertFalse(record.isMapped("NOT MAPPED")); + assertEquals(record.get(0), record.get("A")); + assertEquals(record.get(1), record.get("B")); + assertEquals(record.get(2), record.get("C")); + } + assertFalse(records.hasNext()); + } + } - final Iterator records = CSVFormat.DEFAULT.withHeader().parse(in).iterator(); + @Test + void testProvidedHeaderAuto() throws Exception { + final Reader in = new StringReader("a,b,c\n1,2,3\nx,y,z"); + try (CSVParser parser = CSVFormat.DEFAULT.withHeader().parse(in)) { + final Iterator records = parser.iterator(); + for (int i = 0; i < 2; i++) { + assertTrue(records.hasNext()); + final CSVRecord record = records.next(); + assertTrue(record.isMapped("a")); + assertTrue(record.isMapped("b")); + assertTrue(record.isMapped("c")); + assertFalse(record.isMapped("NOT MAPPED")); + assertEquals(record.get(0), record.get("a")); + assertEquals(record.get(1), record.get("b")); + assertEquals(record.get(2), record.get("c")); + } + assertFalse(records.hasNext()); + } + } - for (int i = 0; i < 2; i++) { - assertTrue(records.hasNext()); + @Test + void testRepeatedHeadersAreReturnedInCSVRecordHeaderNames() throws IOException { + final Reader in = new 
StringReader("header1,header2,header1\n1,2,3\n4,5,6"); + try (CSVParser parser = CSVFormat.DEFAULT.withFirstRecordAsHeader().withTrim().parse(in)) { + final Iterator records = parser.iterator(); final CSVRecord record = records.next(); - assertTrue(record.isMapped("a")); - assertTrue(record.isMapped("b")); - assertTrue(record.isMapped("c")); - assertFalse(record.isMapped("NOT MAPPED")); - assertEquals(record.get(0), record.get("a")); - assertEquals(record.get(1), record.get("b")); - assertEquals(record.get(2), record.get("c")); + @SuppressWarnings("resource") + final CSVParser recordParser = record.getParser(); + assertEquals(Arrays.asList("header1", "header2", "header1"), recordParser.getHeaderNames()); } - - assertFalse(records.hasNext()); } @Test - public void testRoundtrip() throws Exception { + void testRoundtrip() throws Exception { final StringWriter out = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(out, CSVFormat.DEFAULT)) { - final String input = "a,b,c\r\n1,2,3\r\nx,y,z\r\n"; - for (final CSVRecord record : CSVParser.parse(input, CSVFormat.DEFAULT)) { + final String data = "a,b,c\r\n1,2,3\r\nx,y,z\r\n"; + try (CSVPrinter printer = new CSVPrinter(out, CSVFormat.DEFAULT); + CSVParser parse = CSVParser.parse(data, CSVFormat.DEFAULT)) { + for (final CSVRecord record : parse) { printer.printRecord(record); } - assertEquals(input, out.toString()); + assertEquals(data, out.toString()); } } @Test - public void testSkipAutoHeader() throws Exception { + void testSkipAutoHeader() throws Exception { final Reader in = new StringReader("a,b,c\n1,2,3\nx,y,z"); - final Iterator records = CSVFormat.DEFAULT.withHeader().parse(in).iterator(); - final CSVRecord record = records.next(); - assertEquals("1", record.get("a")); - assertEquals("2", record.get("b")); - assertEquals("3", record.get("c")); + try (CSVParser parser = CSVFormat.DEFAULT.withHeader().parse(in)) { + final Iterator records = parser.iterator(); + final CSVRecord record = 
records.next(); + assertEquals("1", record.get("a")); + assertEquals("2", record.get("b")); + assertEquals("3", record.get("c")); + } } @Test - public void testSkipHeaderOverrideDuplicateHeaders() throws Exception { + void testSkipHeaderOverrideDuplicateHeaders() throws Exception { final Reader in = new StringReader("a,a,a\n1,2,3\nx,y,z"); - final Iterator records = CSVFormat.DEFAULT.withHeader("X", "Y", "Z").withSkipHeaderRecord().parse(in) - .iterator(); - final CSVRecord record = records.next(); - assertEquals("1", record.get("X")); - assertEquals("2", record.get("Y")); - assertEquals("3", record.get("Z")); + try (CSVParser parser = CSVFormat.DEFAULT.withHeader("X", "Y", "Z").withSkipHeaderRecord().parse(in)) { + final Iterator records = parser.iterator(); + final CSVRecord record = records.next(); + assertEquals("1", record.get("X")); + assertEquals("2", record.get("Y")); + assertEquals("3", record.get("Z")); + } } @Test - public void testSkipSetAltHeaders() throws Exception { + void testSkipSetAltHeaders() throws Exception { final Reader in = new StringReader("a,b,c\n1,2,3\nx,y,z"); - final Iterator records = CSVFormat.DEFAULT.withHeader("X", "Y", "Z").withSkipHeaderRecord().parse(in) - .iterator(); - final CSVRecord record = records.next(); - assertEquals("1", record.get("X")); - assertEquals("2", record.get("Y")); - assertEquals("3", record.get("Z")); + try (CSVParser parser = CSVFormat.DEFAULT.withHeader("X", "Y", "Z").withSkipHeaderRecord().parse(in)) { + final Iterator records = parser.iterator(); + final CSVRecord record = records.next(); + assertEquals("1", record.get("X")); + assertEquals("2", record.get("Y")); + assertEquals("3", record.get("Z")); + } } @Test - public void testSkipSetHeader() throws Exception { + void testSkipSetHeader() throws Exception { final Reader in = new StringReader("a,b,c\n1,2,3\nx,y,z"); - final Iterator records = CSVFormat.DEFAULT.withHeader("a", "b", "c").withSkipHeaderRecord().parse(in) - .iterator(); - final CSVRecord 
record = records.next(); - assertEquals("1", record.get("a")); - assertEquals("2", record.get("b")); - assertEquals("3", record.get("c")); + try (CSVParser parser = CSVFormat.DEFAULT.withHeader("a", "b", "c").withSkipHeaderRecord().parse(in)) { + final Iterator records = parser.iterator(); + final CSVRecord record = records.next(); + assertEquals("1", record.get("a")); + assertEquals("2", record.get("b")); + assertEquals("3", record.get("c")); + } } @Test - @Ignore - public void testStartWithEmptyLinesThenHeaders() throws Exception { - final String[] codes = { "\r\n\r\n\r\nhello,\r\n\r\n\r\n", "hello,\n\n\n", "hello,\"\"\r\n\r\n\r\n", - "hello,\"\"\n\n\n" }; + @Disabled + void testStartWithEmptyLinesThenHeaders() throws Exception { + final String[] codes = { "\r\n\r\n\r\nhello,\r\n\r\n\r\n", "hello,\n\n\n", "hello,\"\"\r\n\r\n\r\n", "hello,\"\"\n\n\n" }; final String[][] res = { { "hello", "" }, { "" }, // Excel format does not ignore empty lines { "" } }; for (final String code : codes) { - try (final CSVParser parser = CSVParser.parse(code, CSVFormat.EXCEL)) { + try (CSVParser parser = CSVParser.parse(code, CSVFormat.EXCEL)) { final List records = parser.getRecords(); assertEquals(res.length, records.size()); - assertTrue(records.size() > 0); + assertFalse(records.isEmpty()); for (int i = 0; i < res.length; i++) { - assertArrayEquals(res[i], records.get(i).values()); + assertValuesEquals(res[i], records.get(i)); } } } } @Test - public void testTrailingDelimiter() throws Exception { + void testStream() throws Exception { + final Reader in = new StringReader("a,b,c\n1,2,3\nx,y,z"); + try (CSVParser parser = CSVFormat.DEFAULT.parse(in)) { + final List list = parser.stream().collect(Collectors.toList()); + assertFalse(list.isEmpty()); + assertValuesEquals(new String[] { "a", "b", "c" }, list.get(0)); + assertValuesEquals(new String[] { "1", "2", "3" }, list.get(1)); + assertValuesEquals(new String[] { "x", "y", "z" }, list.get(2)); + } + } + + @ParameterizedTest + 
@ValueSource(longs = { -1, 0, 1, 2, 3, 4, Long.MAX_VALUE }) + void testStreamMaxRows(final long maxRows) throws Exception { + final Reader in = new StringReader("a,b,c\n1,2,3\nx,y,z"); + try (CSVParser parser = CSVFormat.DEFAULT.builder().setMaxRows(maxRows).get().parse(in)) { + final List list = parser.stream().collect(Collectors.toList()); + assertFalse(list.isEmpty()); + assertValuesEquals(new String[] { "a", "b", "c" }, list.get(0)); + if (maxRows <= 0 || maxRows > 1) { + assertValuesEquals(new String[] { "1", "2", "3" }, list.get(1)); + } + if (maxRows <= 0 || maxRows > 2) { + assertValuesEquals(new String[] { "x", "y", "z" }, list.get(2)); + } + } + } + + @Test + void testThrowExceptionWithLineAndPosition() throws IOException { + final String csvContent = "col1,col2,col3,col4,col5,col6,col7,col8,col9,col10\nrec1,rec2,rec3,rec4,rec5,rec6,rec7,rec8,\"\"rec9\"\",rec10"; + final StringReader stringReader = new StringReader(csvContent); + // @formatter:off + final CSVFormat csvFormat = CSVFormat.DEFAULT.builder() + .setHeader() + .setSkipHeaderRecord(true) + .get(); + // @formatter:on + try (CSVParser csvParser = csvFormat.parse(stringReader)) { + final UncheckedIOException exception = assertThrows(UncheckedIOException.class, csvParser::getRecords); + assertInstanceOf(CSVException.class, exception.getCause()); + assertTrue(exception.getMessage().contains("Invalid character between encapsulated token and delimiter at line: 2, position: 94"), + exception::getMessage); + } + } + + @Test + void testTrailingDelimiter() throws Exception { final Reader in = new StringReader("a,a,a,\n\"1\",\"2\",\"3\",\nx,y,z,"); - final Iterator records = CSVFormat.DEFAULT.withHeader("X", "Y", "Z").withSkipHeaderRecord() - .withTrailingDelimiter().parse(in).iterator(); - final CSVRecord record = records.next(); - assertEquals("1", record.get("X")); - assertEquals("2", record.get("Y")); - assertEquals("3", record.get("Z")); - Assert.assertEquals(3, record.size()); + try (CSVParser parser 
= CSVFormat.DEFAULT.withHeader("X", "Y", "Z").withSkipHeaderRecord().withTrailingDelimiter().parse(in)) { + final Iterator records = parser.iterator(); + final CSVRecord record = records.next(); + assertEquals("1", record.get("X")); + assertEquals("2", record.get("Y")); + assertEquals("3", record.get("Z")); + assertEquals(3, record.size()); + } } @Test - public void testTrim() throws Exception { + void testTrailingDelimiterKeepsQuotedEmptyLastField() throws Exception { + final CSVFormat format = CSVFormat.DEFAULT.builder().setTrailingDelimiter(true).get(); + try (CSVParser parser = CSVParser.parse("a,b,\"\"", format)) { + final CSVRecord record = parser.iterator().next(); + assertEquals(3, record.size()); + assertEquals("a", record.get(0)); + assertEquals("b", record.get(1)); + assertEquals("", record.get(2)); + } + // An unquoted trailing delimiter still drops the empty field. + try (CSVParser parser = CSVParser.parse("a,b,", format)) { + final CSVRecord record = parser.iterator().next(); + assertEquals(2, record.size()); + } + } + + @Test + void testTrim() throws Exception { final Reader in = new StringReader("a,a,a\n\" 1 \",\" 2 \",\" 3 \"\nx,y,z"); - final Iterator records = CSVFormat.DEFAULT.withHeader("X", "Y", "Z").withSkipHeaderRecord() - .withTrim().parse(in).iterator(); - final CSVRecord record = records.next(); - assertEquals("1", record.get("X")); - assertEquals("2", record.get("Y")); - assertEquals("3", record.get("Z")); - Assert.assertEquals(3, record.size()); + try (CSVParser parser = CSVFormat.DEFAULT.withHeader("X", "Y", "Z").withSkipHeaderRecord().withTrim().parse(in)) { + final Iterator records = parser.iterator(); + final CSVRecord record = records.next(); + assertEquals("1", record.get("X")); + assertEquals("2", record.get("Y")); + assertEquals("3", record.get("Z")); + assertEquals(3, record.size()); + } } private void validateLineNumbers(final String lineSeparator) throws IOException { - try (final CSVParser parser = CSVParser.parse("a" + 
lineSeparator + "b" + lineSeparator + "c", - CSVFormat.DEFAULT.withRecordSeparator(lineSeparator))) { + try (CSVParser parser = CSVParser.parse("a" + lineSeparator + "b" + lineSeparator + "c", CSVFormat.DEFAULT.withRecordSeparator(lineSeparator))) { assertEquals(0, parser.getCurrentLineNumber()); assertNotNull(parser.nextRecord()); assertEquals(1, parser.getCurrentLineNumber()); assertNotNull(parser.nextRecord()); assertEquals(2, parser.getCurrentLineNumber()); assertNotNull(parser.nextRecord()); - // Still 2 because the last line is does not have EOL chars - assertEquals(2, parser.getCurrentLineNumber()); + // Reaching EOF without a trailing EOL should report line 3 + assertEquals(3, parser.getCurrentLineNumber()); assertNull(parser.nextRecord()); - // Still 2 because the last line is does not have EOL chars - assertEquals(2, parser.getCurrentLineNumber()); + // Reaching EOF without a trailing EOL should report line 3 + assertEquals(3, parser.getCurrentLineNumber()); } } private void validateRecordNumbers(final String lineSeparator) throws IOException { - try (final CSVParser parser = CSVParser.parse("a" + lineSeparator + "b" + lineSeparator + "c", - CSVFormat.DEFAULT.withRecordSeparator(lineSeparator))) { + try (CSVParser parser = CSVParser.parse("a" + lineSeparator + "b" + lineSeparator + "c", CSVFormat.DEFAULT.withRecordSeparator(lineSeparator))) { CSVRecord record; assertEquals(0, parser.getRecordNumber()); assertNotNull(record = parser.nextRecord()); @@ -999,61 +2040,77 @@ private void validateRecordNumbers(final String lineSeparator) throws IOExceptio private void validateRecordPosition(final String lineSeparator) throws IOException { final String nl = lineSeparator; // used as linebreak in values for better distinction - final String code = "a,b,c" + lineSeparator + "1,2,3" + lineSeparator + // to see if recordPosition correctly points to the enclosing quote "'A" + nl + "A','B" + nl + "B',CC" + lineSeparator + // unicode test... 
not very relevant while operating on strings instead of bytes, but for // completeness... "\u00c4,\u00d6,\u00dc" + lineSeparator + "EOF,EOF,EOF"; - final CSVFormat format = CSVFormat.newFormat(',').withQuote('\'').withRecordSeparator(lineSeparator); - CSVParser parser = CSVParser.parse(code, format); - - CSVRecord record; - assertEquals(0, parser.getRecordNumber()); - - assertNotNull(record = parser.nextRecord()); - assertEquals(1, record.getRecordNumber()); - assertEquals(code.indexOf('a'), record.getCharacterPosition()); - - assertNotNull(record = parser.nextRecord()); - assertEquals(2, record.getRecordNumber()); - assertEquals(code.indexOf('1'), record.getCharacterPosition()); - - assertNotNull(record = parser.nextRecord()); - final long positionRecord3 = record.getCharacterPosition(); - assertEquals(3, record.getRecordNumber()); - assertEquals(code.indexOf("'A"), record.getCharacterPosition()); - assertEquals("A" + lineSeparator + "A", record.get(0)); - assertEquals("B" + lineSeparator + "B", record.get(1)); - assertEquals("CC", record.get(2)); - - assertNotNull(record = parser.nextRecord()); - assertEquals(4, record.getRecordNumber()); - assertEquals(code.indexOf('\u00c4'), record.getCharacterPosition()); - - assertNotNull(record = parser.nextRecord()); - assertEquals(5, record.getRecordNumber()); - assertEquals(code.indexOf("EOF"), record.getCharacterPosition()); - - parser.close(); - + final long positionRecord3; + try (CSVParser parser = CSVParser.parse(code, format)) { + CSVRecord record; + assertEquals(0, parser.getRecordNumber()); + // nextRecord + assertNotNull(record = parser.nextRecord()); + assertEquals(1, record.getRecordNumber()); + assertEquals(code.indexOf('a'), record.getCharacterPosition()); + // nextRecord + assertNotNull(record = parser.nextRecord()); + assertEquals(2, record.getRecordNumber()); + assertEquals(code.indexOf('1'), record.getCharacterPosition()); + // nextRecord + assertNotNull(record = parser.nextRecord()); + positionRecord3 = 
record.getCharacterPosition(); + assertEquals(3, record.getRecordNumber()); + assertEquals(code.indexOf("'A"), record.getCharacterPosition()); + assertEquals("A" + lineSeparator + "A", record.get(0)); + assertEquals("B" + lineSeparator + "B", record.get(1)); + assertEquals("CC", record.get(2)); + // nextRecord + assertNotNull(record = parser.nextRecord()); + assertEquals(4, record.getRecordNumber()); + assertEquals(code.indexOf('\u00c4'), record.getCharacterPosition()); + // nextRecord + assertNotNull(record = parser.nextRecord()); + assertEquals(5, record.getRecordNumber()); + assertEquals(code.indexOf("EOF"), record.getCharacterPosition()); + } // now try to read starting at record 3 - parser = new CSVParser(new StringReader(code.substring((int) positionRecord3)), format, positionRecord3, 3); - - assertNotNull(record = parser.nextRecord()); - assertEquals(3, record.getRecordNumber()); - assertEquals(code.indexOf("'A"), record.getCharacterPosition()); - assertEquals("A" + lineSeparator + "A", record.get(0)); - assertEquals("B" + lineSeparator + "B", record.get(1)); - assertEquals("CC", record.get(2)); - - assertNotNull(record = parser.nextRecord()); - assertEquals(4, record.getRecordNumber()); - assertEquals(code.indexOf('\u00c4'), record.getCharacterPosition()); - assertEquals("\u00c4", record.get(0)); - - parser.close(); + try (CSVParser parser = CSVParser.builder() + .setReader(new StringReader(code.substring((int) positionRecord3))) + .setFormat(format) + .setCharacterOffset(positionRecord3) + .setRecordNumber(3) + .get()) { + CSVRecord record; + // nextRecord + assertNotNull(record = parser.nextRecord()); + assertEquals(3, record.getRecordNumber()); + assertEquals(code.indexOf("'A"), record.getCharacterPosition()); + assertEquals("A" + lineSeparator + "A", record.get(0)); + assertEquals("B" + lineSeparator + "B", record.get(1)); + assertEquals("CC", record.get(2)); + // nextRecord + assertNotNull(record = parser.nextRecord()); + assertEquals(4, 
record.getRecordNumber()); + assertEquals(code.indexOf('\u00c4'), record.getCharacterPosition()); + assertEquals("\u00c4", record.get(0)); + } // again with ctor + try (CSVParser parser = new CSVParser(new StringReader(code.substring((int) positionRecord3)), format, positionRecord3, 3)) { + CSVRecord record; + // nextRecord + assertNotNull(record = parser.nextRecord()); + assertEquals(3, record.getRecordNumber()); + assertEquals(code.indexOf("'A"), record.getCharacterPosition()); + assertEquals("A" + lineSeparator + "A", record.get(0)); + assertEquals("B" + lineSeparator + "B", record.get(1)); + assertEquals("CC", record.get(2)); + // nextRecord + assertNotNull(record = parser.nextRecord()); + assertEquals(4, record.getRecordNumber()); + assertEquals(code.indexOf('\u00c4'), record.getCharacterPosition()); + assertEquals("\u00c4", record.get(0)); + } } } diff --git a/src/test/java/org/apache/commons/csv/CSVPrinterTest.java b/src/test/java/org/apache/commons/csv/CSVPrinterTest.java index 3ee2438f5d..9ae80c1e51 100644 --- a/src/test/java/org/apache/commons/csv/CSVPrinterTest.java +++ b/src/test/java/org/apache/commons/csv/CSVPrinterTest.java @@ -1,33 +1,51 @@ /* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 + * https://www.apache.org/licenses/LICENSE-2.0 * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. */ package org.apache.commons.csv; +import static org.apache.commons.csv.Constants.BACKSLASH; import static org.apache.commons.csv.Constants.CR; -import static org.junit.Assert.assertArrayEquals; -import static org.junit.Assert.assertEquals; -import static org.junit.Assert.assertFalse; - +import static org.junit.jupiter.api.Assertions.assertArrayEquals; +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertNotEquals; +import static org.junit.jupiter.api.Assertions.assertNull; +import static org.junit.jupiter.api.Assertions.assertThrows; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.never; +import static org.mockito.Mockito.times; +import static org.mockito.Mockito.verify; + +import java.io.CharArrayWriter; import java.io.File; +import java.io.FileReader; import java.io.IOException; +import java.io.PrintStream; +import java.io.Reader; import java.io.StringReader; import java.io.StringWriter; +import java.io.Writer; import java.nio.charset.Charset; import java.nio.charset.StandardCharsets; +import java.nio.file.Files; +import 
java.nio.file.Path; +import java.sql.BatchUpdateException; import java.sql.Connection; import java.sql.DriverManager; import java.sql.ResultSet; @@ -35,29 +53,37 @@ import java.sql.Statement; import java.util.Arrays; import java.util.Date; +import java.util.HashSet; import java.util.Iterator; import java.util.LinkedList; import java.util.List; import java.util.Objects; import java.util.Random; +import java.util.Vector; +import java.util.stream.Stream; import org.apache.commons.io.FileUtils; -import org.junit.Assert; -import org.junit.Ignore; -import org.junit.Test; +import org.apache.commons.io.IOUtils; +import org.apache.commons.io.output.NullOutputStream; +import org.apache.commons.lang3.StringUtils; +import org.h2.tools.SimpleResultSet; +import org.junit.jupiter.api.Disabled; +import org.junit.jupiter.api.Test; +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.ValueSource; /** - * - * - * @version $Id$ + * Tests {@link CSVPrinter}. */ -public class CSVPrinterTest { +class CSVPrinterTest { - private static final char EURO_CH = '\u20AC'; + private static final int TABLE_RECORD_COUNT = 2; + private static final int TABLE_AND_HEADER_RECORD_COUNT = TABLE_RECORD_COUNT + 1; private static final char DQUOTE_CHAR = '"'; - private static final char BACKSLASH_CH = '\\'; + private static final char EURO_CH = '\u20AC'; + private static final int ITERATIONS_FOR_RANDOM_TEST = 50_000; private static final char QUOTE_CH = '\''; - private static final int ITERATIONS_FOR_RANDOM_TEST = 50000; + private static final String RECORD_SEPARATOR = CSVFormat.DEFAULT.getRecordSeparator(); private static String printable(final String s) { final StringBuilder sb = new StringBuilder(); @@ -72,7 +98,25 @@ private static String printable(final String s) { return sb.toString(); } - private final String recordSeparator = CSVFormat.DEFAULT.getRecordSeparator(); + private String longText2; + + private void assertInitialState(final CSVPrinter printer) { + 
assertEquals(0, printer.getRecordCount()); + } + + private void assertRowCount(final CSVFormat format, final String resultString, final int rowCount) throws IOException { + try (CSVParser parser = format.parse(new StringReader(resultString))) { + assertEquals(rowCount, parser.getRecords().size()); + } + } + + private File createTempFile() throws IOException { + return createTempPath().toFile(); + } + + private Path createTempPath() throws IOException { + return Files.createTempFile(getClass().getName(), ".csv"); + } private void doOneRandom(final CSVFormat format) throws Exception { final Random r = new Random(); @@ -83,7 +127,7 @@ private void doOneRandom(final CSVFormat format) throws Exception { final String[][] lines = generateLines(nLines, nCol); final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, format)) { + try (CSVPrinter printer = new CSVPrinter(sw, format)) { for (int i = 0; i < nLines; i++) { // for (int j=0; j parseResult = parser.getRecords(); final String[][] expected = lines.clone(); for (int i = 0; i < expected.length; i++) { expected[i] = expectNulls(expected[i], format); } - Utils.compare("Printer output :" + printable(result), expected, parseResult); + Utils.compare("Printer output :" + printable(result), expected, parseResult, -1); } } @@ -113,8 +157,8 @@ private void doRandom(final CSVFormat format, final int iter) throws Exception { } /** - * Converts an input CSV array into expected output values WRT NULLs. NULL strings are converted to null values - * because the parser will convert these strings to null. + * Converts an input CSV array into expected output values, including NULLs. NULL strings are converted to null values because the parser will convert + * these strings to null. 
*/ private T[] expectNulls(final T[] original, final CSVFormat csvFormat) { final T[] fixed = original.clone(); @@ -126,11 +170,6 @@ private T[] expectNulls(final T[] original, final CSVFormat csvFormat) { return fixed; } - private Connection geH2Connection() throws SQLException, ClassNotFoundException { - Class.forName("org.h2.Driver"); - return DriverManager.getConnection("jdbc:h2:mem:my_test;", "sa", ""); - } - private String[][] generateLines(final int nLines, final int nCol) { final String[][] lines = new String[nLines][]; for (int i = 0; i < nLines; i++) { @@ -143,29 +182,37 @@ private String[][] generateLines(final int nLines, final int nCol) { return lines; } - private CSVPrinter printWithHeaderComments(final StringWriter sw, final Date now, final CSVFormat baseFormat) - throws IOException { - CSVFormat format = baseFormat; + private Connection getH2Connection() throws SQLException, ClassNotFoundException { + Class.forName("org.h2.Driver"); + return DriverManager.getConnection("jdbc:h2:mem:my_test;", "sa", ""); + } + + private CSVPrinter printWithHeaderComments(final StringWriter sw, final Date now, final CSVFormat baseFormat) throws IOException { // Use withHeaderComments first to test CSV-145 - format = format.withHeaderComments("Generated by Apache Commons CSV 1.1", now); - format = format.withCommentMarker('#'); - format = format.withHeader("Col1", "Col2"); - final CSVPrinter csvPrinter = format.print(sw); - csvPrinter.printRecord("A", "B"); - csvPrinter.printRecord("C", "D"); - csvPrinter.close(); - return csvPrinter; + // @formatter:off + final CSVFormat format = baseFormat.builder() + .setHeaderComments((String[]) null) // don't blow up + .setHeaderComments((Object[]) null) // don't blow up + .setHeaderComments("Generated by Apache Commons CSV 1.1", now) + .setCommentMarker('#') + .setHeader("Col1", "Col2") + .get(); + // @formatter:on + final CSVPrinter printer = format.print(sw); + printer.printRecord("A", "B"); + printer.printRecord("C", "D"); + 
printer.close(); + return printer; } private String randStr() { final Random r = new Random(); - final int sz = r.nextInt(20); // sz = r.nextInt(3); final char[] buf = new char[sz]; for (int i = 0; i < sz; i++) { // stick in special chars with greater frequency - char ch; + final char ch; final int what = r.nextInt(20); switch (what) { case 0: @@ -193,7 +240,7 @@ private String randStr() { ch = '\''; break; case 8: - ch = BACKSLASH_CH; + ch = BACKSLASH; break; default: ch = (char) r.nextInt(300); @@ -206,17 +253,134 @@ private String randStr() { } private void setUpTable(final Connection connection) throws SQLException { - try (final Statement statement = connection.createStatement()) { - statement.execute("CREATE TABLE TEST(ID INT PRIMARY KEY, NAME VARCHAR(255))"); - statement.execute("insert into TEST values(1, 'r1')"); - statement.execute("insert into TEST values(2, 'r2')"); + try (Statement statement = connection.createStatement()) { + statement.execute("CREATE TABLE TEST(ID INT PRIMARY KEY, NAME VARCHAR(255), TEXT CLOB, BIN_DATA BLOB)"); + statement.execute("insert into TEST values(1, 'r1', 'long text 1', 'binary data 1')"); + longText2 = StringUtils.repeat('a', IOUtils.DEFAULT_BUFFER_SIZE - 4); + longText2 += "\"\r\n\"b\""; + longText2 += StringUtils.repeat('c', IOUtils.DEFAULT_BUFFER_SIZE - 1); + statement.execute("insert into TEST values(2, 'r2', '" + longText2 + "', 'binary data 2')"); + longText2 = longText2.replace("\"", "\"\""); + } + } + + @Test + void testCloseBackwardCompatibility() throws IOException { + try (Writer writer = mock(Writer.class)) { + final CSVFormat csvFormat = CSVFormat.DEFAULT; + try (CSVPrinter printer = new CSVPrinter(writer, csvFormat)) { + assertInitialState(printer); + } + verify(writer, never()).flush(); + verify(writer, times(1)).close(); + } + } + + @Test + void testCloseWithCsvFormatAutoFlushOff() throws IOException { + try (Writer writer = mock(Writer.class)) { + final CSVFormat csvFormat = 
CSVFormat.DEFAULT.withAutoFlush(false); + try (CSVPrinter printer = new CSVPrinter(writer, csvFormat)) { + assertInitialState(printer); + } + verify(writer, never()).flush(); + verify(writer, times(1)).close(); + } + } + + @Test + void testCloseWithCsvFormatAutoFlushOn() throws IOException { + // System.out.println("start method"); + try (Writer writer = mock(Writer.class)) { + final CSVFormat csvFormat = CSVFormat.DEFAULT.withAutoFlush(true); + try (CSVPrinter printer = new CSVPrinter(writer, csvFormat)) { + assertInitialState(printer); + } + verify(writer, times(1)).flush(); + verify(writer, times(1)).close(); + } + } + + @Test + void testCloseWithFlushOff() throws IOException { + try (Writer writer = mock(Writer.class)) { + final CSVFormat csvFormat = CSVFormat.DEFAULT; + @SuppressWarnings("resource") + final CSVPrinter printer = new CSVPrinter(writer, csvFormat); + assertInitialState(printer); + printer.close(false); + assertEquals(0, printer.getRecordCount()); + verify(writer, never()).flush(); + verify(writer, times(1)).close(); + } + } + + @Test + void testCloseWithFlushOn() throws IOException { + try (Writer writer = mock(Writer.class)) { + @SuppressWarnings("resource") + final CSVPrinter printer = new CSVPrinter(writer, CSVFormat.DEFAULT); + assertInitialState(printer); + printer.close(true); + assertEquals(0, printer.getRecordCount()); + verify(writer, times(1)).flush(); + } + } + + @Test + void testCRComment() throws IOException { + final StringWriter sw = new StringWriter(); + final Object value = "abc"; + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withCommentMarker('#'))) { + assertInitialState(printer); + printer.print(value); + assertEquals(0, printer.getRecordCount()); + printer.printComment("This is a comment\r\non multiple lines\rthis is next comment\r"); + assertEquals("abc" + RECORD_SEPARATOR + "# This is a comment" + RECORD_SEPARATOR + "# on multiple lines" + RECORD_SEPARATOR + + "# this is next comment" + RECORD_SEPARATOR + 
"# " + RECORD_SEPARATOR, sw.toString()); + assertEquals(0, printer.getRecordCount()); + } + } + + @Test + void testCSV135() throws IOException { + final List list = new LinkedList<>(); + list.add("\"\""); // "" + list.add("\\\\"); // \\ + list.add("\\\"\\"); // \"\ + // + // "",\\,\"\ (unchanged) + tryFormat(list, null, null, "\"\",\\\\,\\\"\\"); + // + // """""",\\,"\""\" (quoted, and embedded DQ doubled) + tryFormat(list, '"', null, "\"\"\"\"\"\",\\\\,\"\\\"\"\\\""); + // + // "",\\\\,\\"\\ (escapes escaped, not quoted) + tryFormat(list, null, '\\', "\"\",\\\\\\\\,\\\\\"\\\\"); + // + // "\"\"","\\\\","\\\"\\" (quoted, and embedded DQ & escape escaped) + tryFormat(list, '"', '\\', "\"\\\"\\\"\",\"\\\\\\\\\",\"\\\\\\\"\\\\\""); + // + // """""",\\,"\""\" (quoted, embedded DQ escaped) + tryFormat(list, '"', '"', "\"\"\"\"\"\",\\\\,\"\\\"\"\\\""); + } + + @Test + void testCSV259() throws IOException { + final StringWriter sw = new StringWriter(); + try (Reader reader = new FileReader("src/test/resources/org/apache/commons/csv/CSV-259/sample.txt"); + CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withEscape('!').withQuote(null))) { + assertInitialState(printer); + printer.print(reader); + assertEquals("x!,y!,z", sw.toString()); } } @Test - public void testDelimeterQuoted() throws IOException { + void testDelimeterQuoted() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote('\''))) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote('\''))) { + assertInitialState(printer); printer.print("a,b,c"); printer.print("xyz"); assertEquals("'a,b,c',xyz", sw.toString()); @@ -224,10 +388,11 @@ public void testDelimeterQuoted() throws IOException { } @Test - public void testDelimeterQuoteNONE() throws IOException { + void testDelimeterQuoteNone() throws IOException { final StringWriter sw = new StringWriter(); final CSVFormat format = 
CSVFormat.DEFAULT.withEscape('!').withQuoteMode(QuoteMode.NONE); - try (final CSVPrinter printer = new CSVPrinter(sw, format)) { + try (CSVPrinter printer = new CSVPrinter(sw, format)) { + assertInitialState(printer); printer.print("a,b,c"); printer.print("xyz"); assertEquals("a!,b!,c,xyz", sw.toString()); @@ -235,9 +400,34 @@ public void testDelimeterQuoteNONE() throws IOException { } @Test - public void testDelimiterEscaped() throws IOException { + void testDelimeterStringQuoted() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withEscape('!').withQuote(null))) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.builder().setDelimiter("[|]").setQuote('\'').get())) { + assertInitialState(printer); + printer.print("a[|]b[|]c"); + printer.print("xyz"); + assertEquals("'a[|]b[|]c'[|]xyz", sw.toString()); + } + } + + @Test + void testDelimeterStringQuoteNone() throws IOException { + final StringWriter sw = new StringWriter(); + final CSVFormat format = CSVFormat.DEFAULT.builder().setDelimiter("[|]").setEscape('!').setQuoteMode(QuoteMode.NONE).get(); + try (CSVPrinter printer = new CSVPrinter(sw, format)) { + assertInitialState(printer); + printer.print("a[|]b[|]c"); + printer.print("xyz"); + printer.print("a[xy]bc[]"); + assertEquals("a![!|!]b![!|!]c[|]xyz[|]a[xy]bc[]", sw.toString()); + } + } + + @Test + void testDelimiterEscaped() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withEscape('!').withQuote(null))) { + assertInitialState(printer); printer.print("a,b,c"); printer.print("xyz"); assertEquals("a!,b!,c,xyz", sw.toString()); @@ -245,9 +435,10 @@ public void testDelimiterEscaped() throws IOException { } @Test - public void testDelimiterPlain() throws IOException { + void testDelimiterPlain() throws IOException { final StringWriter sw = new StringWriter(); - try (final 
CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(null))) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(null))) { + assertInitialState(printer); printer.print("a,b,c"); printer.print("xyz"); assertEquals("a,b,c,xyz", sw.toString()); @@ -255,18 +446,42 @@ public void testDelimiterPlain() throws IOException { } @Test - public void testDisabledComment() throws IOException { + void testDelimiterStringEscaped() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.builder().setDelimiter("|||").setEscape('!').setQuote(null).get())) { + assertInitialState(printer); + printer.print("a|||b|||c"); + printer.print("xyz"); + assertEquals("a!|!|!|b!|!|!|c|||xyz", sw.toString()); + } + } + + @Test + void testDisabledComment() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { + assertInitialState(printer); printer.printComment("This is a comment"); assertEquals("", sw.toString()); + assertEquals(0, printer.getRecordCount()); + } + } + + @Test + void testDontQuoteEuroFirstChar() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.RFC4180)) { + assertInitialState(printer); + printer.printRecord(EURO_CH, "Deux"); + assertEquals(EURO_CH + ",Deux" + RECORD_SEPARATOR, sw.toString()); } } @Test - public void testEOLEscaped() throws IOException { + void testEolEscaped() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(null).withEscape('!'))) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(null).withEscape('!'))) { + assertInitialState(printer); printer.print("a\rb\nc"); printer.print("x\fy\bz"); 
assertEquals("a!rb!nc,x\fy\bz", sw.toString()); @@ -274,9 +489,10 @@ public void testEOLEscaped() throws IOException { } @Test - public void testEOLPlain() throws IOException { + void testEolPlain() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(null))) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(null))) { + assertInitialState(printer); printer.print("a\rb\nc"); printer.print("x\fy\bz"); assertEquals("a\rb\nc,x\fy\bz", sw.toString()); @@ -284,166 +500,307 @@ public void testEOLPlain() throws IOException { } @Test - public void testEOLQuoted() throws IOException { + void testEolQuoted() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote('\''))) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote('\''))) { + assertInitialState(printer); printer.print("a\rb\nc"); printer.print("x\by\fz"); assertEquals("'a\rb\nc',x\by\fz", sw.toString()); } } + @SuppressWarnings("unlikely-arg-type") @Test - public void testEscapeBackslash1() throws IOException { - StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(QUOTE_CH))) { + void testEquals() throws IOException { + // Don't use assertNotEquals here + assertFalse(CSVFormat.DEFAULT.equals(null)); + // Don't use assertNotEquals here + assertFalse(CSVFormat.DEFAULT.equals("")); + } + + @Test + void testEscapeBackslash1() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(QUOTE_CH))) { + assertInitialState(printer); printer.print("\\"); } assertEquals("\\", sw.toString()); } @Test - public void testEscapeBackslash2() throws IOException { - StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, 
CSVFormat.DEFAULT.withQuote(QUOTE_CH))) { + void testEscapeBackslash2() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(QUOTE_CH))) { + assertInitialState(printer); printer.print("\\\r"); } assertEquals("'\\\r'", sw.toString()); } @Test - public void testEscapeBackslash3() throws IOException { - StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(QUOTE_CH))) { + void testEscapeBackslash3() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(QUOTE_CH))) { + assertInitialState(printer); printer.print("X\\\r"); } assertEquals("'X\\\r'", sw.toString()); } @Test - public void testEscapeBackslash4() throws IOException { - StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(QUOTE_CH))) { + void testEscapeBackslash4() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(QUOTE_CH))) { + assertInitialState(printer); printer.print("\\\\"); } assertEquals("\\\\", sw.toString()); } @Test - public void testEscapeBackslash5() throws IOException { - StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(QUOTE_CH))) { + void testEscapeBackslash5() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(QUOTE_CH))) { + assertInitialState(printer); printer.print("\\\\"); } assertEquals("\\\\", sw.toString()); } @Test - public void testEscapeNull1() throws IOException { - StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withEscape(null))) { + void 
testEscapeCommentMarkerFirstChar() throws IOException { + // No quoting available in escape mode, so a leading comment marker must be escaped or the + // record reads back as a comment and is dropped. Mirrors the quoting fix for QuoteMode.MINIMAL. + final CSVFormat format = CSVFormat.DEFAULT.builder().setQuote(null).setEscape('\\').setCommentMarker(';').get(); + final StringWriter sw = new StringWriter(); + final String col1 = ";comment-like"; + try (CSVPrinter printer = new CSVPrinter(sw, format)) { + printer.printRecord(col1, "b"); + printer.printRecord(new StringReader(col1), new StringReader("b")); + // The marker past the first character does not start a comment and is left alone. + printer.printRecord("a;b", ";c"); + } + final String string = sw.toString(); + assertEquals("\\;comment-like,b" + RECORD_SEPARATOR + + "\\;comment-like,b" + RECORD_SEPARATOR + + "a;b,\\;c" + RECORD_SEPARATOR, string); + // The emitted records must read back as the original values, none parsed as a comment. 
+ try (CSVParser parser = CSVParser.parse(string, format)) { + final List records = parser.getRecords(); + assertEquals(3, records.size()); + assertEquals(col1, records.get(0).get(0)); + assertEquals("b", records.get(0).get(1)); + assertEquals(col1, records.get(1).get(0)); + assertEquals("b", records.get(1).get(1)); + assertEquals("a;b", records.get(2).get(0)); + assertEquals(";c", records.get(2).get(1)); + } + } + + @Test + void testEscapeCommentMarkerFirstCharWithQuoteModeNone() throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.builder().setEscape('\\').setQuoteMode(QuoteMode.NONE).setCommentMarker(';').get(); + final StringWriter sw = new StringWriter(); + final String col1 = ";bar"; + try (CSVPrinter printer = new CSVPrinter(sw, format)) { + printer.printRecord(col1, "b"); + printer.printRecord(new StringReader(col1), new StringReader("b")); + } + final String string = sw.toString(); + assertEquals("\\;bar,b" + RECORD_SEPARATOR + "\\;bar,b" + RECORD_SEPARATOR, string); + try (CSVParser parser = CSVParser.parse(string, format)) { + final List records = parser.getRecords(); + assertEquals(2, records.size()); + for (final CSVRecord record : records) { + assertEquals(col1, record.get(0)); + assertEquals("b", record.get(1)); + } + } + } + + @Test + void testEscapeNull1() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withEscape(null))) { + assertInitialState(printer); printer.print("\\"); } assertEquals("\\", sw.toString()); } @Test - public void testEscapeNull2() throws IOException { - StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withEscape(null))) { + void testEscapeNull2() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withEscape(null))) { + assertInitialState(printer); printer.print("\\\r"); } 
assertEquals("\"\\\r\"", sw.toString()); } @Test - public void testEscapeNull3() throws IOException { - StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withEscape(null))) { + void testEscapeNull3() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withEscape(null))) { + assertInitialState(printer); printer.print("X\\\r"); } assertEquals("\"X\\\r\"", sw.toString()); } @Test - public void testEscapeNull4() throws IOException { - StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withEscape(null))) { + void testEscapeNull4() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withEscape(null))) { + assertInitialState(printer); printer.print("\\\\"); } assertEquals("\\\\", sw.toString()); } @Test - public void testEscapeNull5() throws IOException { - StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withEscape(null))) { + void testEscapeNull5() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withEscape(null))) { + assertInitialState(printer); printer.print("\\\\"); } assertEquals("\\\\", sw.toString()); } @Test - public void testExcelPrintAllArrayOfArrays() throws IOException { + void testExcelPrintAllArrayOfArrays() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.EXCEL)) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.EXCEL)) { + assertInitialState(printer); printer.printRecords((Object[]) new String[][] { { "r1c1", "r1c2" }, { "r2c1", "r2c2" } }); - assertEquals("r1c1,r1c2" + recordSeparator + "r2c1,r2c2" + recordSeparator, sw.toString()); + 
assertEquals("r1c1,r1c2" + RECORD_SEPARATOR + "r2c1,r2c2" + RECORD_SEPARATOR, sw.toString()); + } + } + + @Test + void testExcelPrintAllArrayOfArraysWithFirstEmptyValue2() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.EXCEL)) { + assertInitialState(printer); + printer.printRecords((Object[]) new String[][] { { "" } }); + assertEquals("\"\"" + RECORD_SEPARATOR, sw.toString()); + } + } + + @Test + void testExcelPrintAllArrayOfArraysWithFirstSpaceValue1() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.EXCEL)) { + assertInitialState(printer); + printer.printRecords((Object[]) new String[][] { { " ", "r1c2" } }); + assertEquals("\" \",r1c2" + RECORD_SEPARATOR, sw.toString()); } } @Test - public void testExcelPrintAllArrayOfLists() throws IOException { + void testExcelPrintAllArrayOfArraysWithFirstTabValue1() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.EXCEL)) { - printer.printRecords( - (Object[]) new List[] { Arrays.asList("r1c1", "r1c2"), Arrays.asList("r2c1", "r2c2") }); - assertEquals("r1c1,r1c2" + recordSeparator + "r2c1,r2c2" + recordSeparator, sw.toString()); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.EXCEL)) { + assertInitialState(printer); + printer.printRecords((Object[]) new String[][] { { "\t", "r1c2" } }); + assertEquals("\"\t\",r1c2" + RECORD_SEPARATOR, sw.toString()); } } @Test - public void testExcelPrintAllIterableOfArrays() throws IOException { + void testExcelPrintAllArrayOfLists() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.EXCEL)) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.EXCEL)) { + assertInitialState(printer); + printer.printRecords((Object[]) new List[] { Arrays.asList("r1c1", "r1c2"), 
Arrays.asList("r2c1", "r2c2") }); + assertEquals("r1c1,r1c2" + RECORD_SEPARATOR + "r2c1,r2c2" + RECORD_SEPARATOR, sw.toString()); + } + } + + @Test + void testExcelPrintAllArrayOfListsWithFirstEmptyValue2() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.EXCEL)) { + assertInitialState(printer); + printer.printRecords((Object[]) new List[] { Arrays.asList("") }); + assertEquals("\"\"" + RECORD_SEPARATOR, sw.toString()); + } + } + + @Test + void testExcelPrintAllIterableOfArrays() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.EXCEL)) { + assertInitialState(printer); printer.printRecords(Arrays.asList(new String[][] { { "r1c1", "r1c2" }, { "r2c1", "r2c2" } })); - assertEquals("r1c1,r1c2" + recordSeparator + "r2c1,r2c2" + recordSeparator, sw.toString()); + assertEquals("r1c1,r1c2" + RECORD_SEPARATOR + "r2c1,r2c2" + RECORD_SEPARATOR, sw.toString()); } } @Test - public void testExcelPrintAllIterableOfLists() throws IOException { + void testExcelPrintAllIterableOfArraysWithFirstEmptyValue2() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.EXCEL)) { - printer.printRecords( - Arrays.asList(new List[] { Arrays.asList("r1c1", "r1c2"), Arrays.asList("r2c1", "r2c2") })); - assertEquals("r1c1,r1c2" + recordSeparator + "r2c1,r2c2" + recordSeparator, sw.toString()); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.EXCEL)) { + assertInitialState(printer); + printer.printRecords(Arrays.asList(new String[][] { { "" } })); + assertEquals("\"\"" + RECORD_SEPARATOR, sw.toString()); } } @Test - public void testExcelPrinter1() throws IOException { + void testExcelPrintAllIterableOfLists() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.EXCEL)) { + try (CSVPrinter printer = 
new CSVPrinter(sw, CSVFormat.EXCEL)) { + assertInitialState(printer); + printer.printRecords(Arrays.asList(Arrays.asList("r1c1", "r1c2"), Arrays.asList("r2c1", "r2c2"))); + assertEquals("r1c1,r1c2" + RECORD_SEPARATOR + "r2c1,r2c2" + RECORD_SEPARATOR, sw.toString()); + } + } + + @ParameterizedTest + @ValueSource(longs = { -1, 0, 1, 2, Long.MAX_VALUE }) + void testExcelPrintAllStreamOfArrays(final long maxRows) throws IOException { + final StringWriter sw = new StringWriter(); + final CSVFormat format = CSVFormat.EXCEL.builder().setMaxRows(maxRows).get(); + try (CSVPrinter printer = new CSVPrinter(sw, format)) { + assertInitialState(printer); + printer.printRecords(Stream.of(new String[][] { { "r1c1", "r1c2" }, { "r2c1", "r2c2" } })); + String expected = "r1c1,r1c2" + RECORD_SEPARATOR; + if (maxRows != 1) { + expected += "r2c1,r2c2" + RECORD_SEPARATOR; + } + assertEquals(expected, sw.toString()); + } + } + + @Test + void testExcelPrinter1() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.EXCEL)) { + assertInitialState(printer); printer.printRecord("a", "b"); - assertEquals("a,b" + recordSeparator, sw.toString()); + assertEquals("a,b" + RECORD_SEPARATOR, sw.toString()); } } @Test - public void testExcelPrinter2() throws IOException { + void testExcelPrinter2() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.EXCEL)) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.EXCEL)) { + assertInitialState(printer); printer.printRecord("a,b", "b"); - assertEquals("\"a,b\",b" + recordSeparator, sw.toString()); + assertEquals("\"a,b\",b" + RECORD_SEPARATOR, sw.toString()); } } @Test - public void testHeader() throws IOException { + void testHeader() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, - CSVFormat.DEFAULT.withQuote(null).withHeader("C1", 
"C2", "C3"))) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(null).withHeader("C1", "C2", "C3"))) { + assertEquals(1, printer.getRecordCount()); printer.printRecord("a", "b", "c"); printer.printRecord("x", "y", "z"); assertEquals("C1,C2,C3\r\na,b,c\r\nx,y,z\r\n", sw.toString()); @@ -451,96 +808,168 @@ public void testHeader() throws IOException { } @Test - public void testHeaderCommentExcel() throws IOException { + void testHeaderCommentExcel() throws IOException { final StringWriter sw = new StringWriter(); final Date now = new Date(); final CSVFormat format = CSVFormat.EXCEL; - try (final CSVPrinter csvPrinter = printWithHeaderComments(sw, now, format)) { - assertEquals("# Generated by Apache Commons CSV 1.1\r\n# " + now + "\r\nCol1,Col2\r\nA,B\r\nC,D\r\n", - sw.toString()); + try (CSVPrinter csvPrinter = printWithHeaderComments(sw, now, format)) { + assertEquals("# Generated by Apache Commons CSV 1.1\r\n# " + now + "\r\nCol1,Col2\r\nA,B\r\nC,D\r\n", sw.toString()); } } @Test - public void testHeaderCommentTdf() throws IOException { + void testHeaderCommentTdf() throws IOException { final StringWriter sw = new StringWriter(); final Date now = new Date(); final CSVFormat format = CSVFormat.TDF; - try (final CSVPrinter csvPrinter = printWithHeaderComments(sw, now, format)) { - assertEquals("# Generated by Apache Commons CSV 1.1\r\n# " + now + "\r\nCol1\tCol2\r\nA\tB\r\nC\tD\r\n", - sw.toString()); + try (CSVPrinter csvPrinter = printWithHeaderComments(sw, now, format)) { + assertEquals("# Generated by Apache Commons CSV 1.1\r\n# " + now + "\r\nCol1\tCol2\r\nA\tB\r\nC\tD\r\n", sw.toString()); } } @Test - public void testHeaderNotSet() throws IOException { + void testHeaderNotSet() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(null))) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(null))) { + 
assertInitialState(printer); printer.printRecord("a", "b", "c"); printer.printRecord("x", "y", "z"); assertEquals("a,b,c\r\nx,y,z\r\n", sw.toString()); } } - @Test(expected = IllegalArgumentException.class) - public void testInvalidFormat() throws Exception { - final CSVFormat invalidFormat = CSVFormat.DEFAULT.withDelimiter(CR); - try (final CSVPrinter printer = new CSVPrinter(new StringWriter(), invalidFormat)) { - Assert.fail("This test should have thrown an exception."); - } + @Test + void testInvalidFormat() { + assertThrows(IllegalArgumentException.class, () -> CSVFormat.DEFAULT.withDelimiter(CR)); } @Test - public void testJdbcPrinter() throws IOException, ClassNotFoundException, SQLException { + void testJdbcPrinter() throws IOException, ClassNotFoundException, SQLException { final StringWriter sw = new StringWriter(); - try (final Connection connection = geH2Connection()) { + final CSVFormat csvFormat = CSVFormat.DEFAULT; + try (Connection connection = getH2Connection()) { setUpTable(connection); - try (final Statement stmt = connection.createStatement(); - final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { - printer.printRecords(stmt.executeQuery("select ID, NAME from TEST")); + try (Statement stmt = connection.createStatement(); + CSVPrinter printer = new CSVPrinter(sw, csvFormat); + ResultSet resultSet = stmt.executeQuery("select ID, NAME, TEXT, BIN_DATA from TEST")) { + assertInitialState(printer); + printer.printRecords(resultSet); + assertEquals(TABLE_RECORD_COUNT, printer.getRecordCount()); } } - assertEquals("1,r1" + recordSeparator + "2,r2" + recordSeparator, sw.toString()); + final String csv = sw.toString(); + assertEquals("1,r1,\"long text 1\",\"YmluYXJ5IGRhdGEgMQ==\"" + RECORD_SEPARATOR + "2,r2,\"" + longText2 + "\",\"YmluYXJ5IGRhdGEgMg==\"" + + RECORD_SEPARATOR, csv); + // Round trip the data + try (StringReader reader = new StringReader(csv); + CSVParser csvParser = csvFormat.parse(reader)) { + // Row 1 + CSVRecord record = 
csvParser.nextRecord(); + assertEquals("1", record.get(0)); + assertEquals("r1", record.get(1)); + assertEquals("long text 1", record.get(2)); + assertEquals("YmluYXJ5IGRhdGEgMQ==", record.get(3)); + // Row 2 + record = csvParser.nextRecord(); + assertEquals("2", record.get(0)); + assertEquals("r2", record.get(1)); + assertEquals("YmluYXJ5IGRhdGEgMg==", record.get(3)); + } } @Test - public void testJdbcPrinterWithResultSet() throws IOException, ClassNotFoundException, SQLException { + void testJdbcPrinterWithFirstEmptyValue2() throws IOException, ClassNotFoundException, SQLException { final StringWriter sw = new StringWriter(); - Class.forName("org.h2.Driver"); - try (final Connection connection = geH2Connection();) { + try (Connection connection = getH2Connection()) { + try (Statement stmt = connection.createStatement(); + ResultSet resultSet = stmt.executeQuery("select '' AS EMPTYVALUE from DUAL"); + CSVPrinter printer = CSVFormat.DEFAULT.withHeader(resultSet).print(sw)) { + printer.printRecords(resultSet); + } + } + assertEquals("EMPTYVALUE" + RECORD_SEPARATOR + "\"\"" + RECORD_SEPARATOR, sw.toString()); + } + + @ParameterizedTest + @ValueSource(longs = { -1, 0, 1, 2, 3, 4, Long.MAX_VALUE }) + void testJdbcPrinterWithResultSet(final long maxRows) throws IOException, ClassNotFoundException, SQLException { + final StringWriter sw = new StringWriter(); + final CSVFormat format = CSVFormat.DEFAULT.builder().setMaxRows(maxRows).get(); + try (Connection connection = getH2Connection()) { setUpTable(connection); - try (final Statement stmt = connection.createStatement(); - final ResultSet resultSet = stmt.executeQuery("select ID, NAME from TEST"); - final CSVPrinter printer = CSVFormat.DEFAULT.withHeader(resultSet).print(sw)) { + try (Statement stmt = connection.createStatement(); + ResultSet resultSet = stmt.executeQuery("select ID, NAME, TEXT from TEST"); + CSVPrinter printer = format.withHeader(resultSet).print(sw)) { printer.printRecords(resultSet); } } - 
assertEquals("ID,NAME" + recordSeparator + "1,r1" + recordSeparator + "2,r2" + recordSeparator, sw.toString()); + final String resultString = sw.toString(); + final String header = "ID,NAME,TEXT"; + final String headerRow1 = header + RECORD_SEPARATOR + "1,r1,\"long text 1\"" + RECORD_SEPARATOR; + final String allRows = headerRow1 + "2,r2,\"" + longText2 + "\"" + RECORD_SEPARATOR; + final int expectedRowsWithHeader; + if (maxRows == 1) { + assertEquals(headerRow1, resultString); + expectedRowsWithHeader = 2; + } else { + assertEquals(allRows, resultString); + expectedRowsWithHeader = TABLE_AND_HEADER_RECORD_COUNT; + } + assertRowCount(CSVFormat.DEFAULT, resultString, expectedRowsWithHeader); } - @Test - public void testJdbcPrinterWithResultSetMetaData() throws IOException, ClassNotFoundException, SQLException { + @ParameterizedTest + @ValueSource(longs = { -1, 0, 3, 4, Long.MAX_VALUE }) + void testJdbcPrinterWithResultSetHeader(final long maxRows) throws IOException, ClassNotFoundException, SQLException { final StringWriter sw = new StringWriter(); - Class.forName("org.h2.Driver"); - try (final Connection connection = geH2Connection()) { + try (Connection connection = getH2Connection()) { + setUpTable(connection); + final CSVFormat format = CSVFormat.DEFAULT.builder().setMaxRows(maxRows).get(); + try (Statement stmt = connection.createStatement(); + CSVPrinter printer = new CSVPrinter(sw, format)) { + try (ResultSet resultSet = stmt.executeQuery("select ID, NAME from TEST")) { + printer.printRecords(resultSet, true); + assertEquals(TABLE_RECORD_COUNT, printer.getRecordCount()); + assertEquals("ID,NAME" + RECORD_SEPARATOR + "1,r1" + RECORD_SEPARATOR + "2,r2" + RECORD_SEPARATOR, sw.toString()); + } + assertRowCount(format, sw.toString(), TABLE_AND_HEADER_RECORD_COUNT); + try (ResultSet resultSet = stmt.executeQuery("select ID, NAME from TEST")) { + printer.printRecords(resultSet, false); + assertEquals(TABLE_RECORD_COUNT * 2, printer.getRecordCount()); + 
assertNotEquals("ID,NAME" + RECORD_SEPARATOR + "1,r1" + RECORD_SEPARATOR + "2,r2" + RECORD_SEPARATOR, sw.toString()); + } + assertRowCount(CSVFormat.DEFAULT, sw.toString(), TABLE_AND_HEADER_RECORD_COUNT + TABLE_RECORD_COUNT); + } + } + } + + @ParameterizedTest + @ValueSource(longs = { -1, 0, 3, 4, Long.MAX_VALUE }) + void testJdbcPrinterWithResultSetMetaData(final long maxRows) throws IOException, ClassNotFoundException, SQLException { + final StringWriter sw = new StringWriter(); + try (Connection connection = getH2Connection()) { setUpTable(connection); - try (final Statement stmt = connection.createStatement(); - final ResultSet resultSet = stmt.executeQuery("select ID, NAME from TEST"); - final CSVPrinter printer = CSVFormat.DEFAULT.withHeader(resultSet.getMetaData()).print(sw)) { + final CSVFormat format = CSVFormat.DEFAULT.builder().setMaxRows(maxRows).get(); + try (Statement stmt = connection.createStatement(); + ResultSet resultSet = stmt.executeQuery("select ID, NAME, TEXT from TEST"); + CSVPrinter printer = format.withHeader(resultSet.getMetaData()).print(sw)) { + // The header is the first record. 
+ assertEquals(1, printer.getRecordCount()); printer.printRecords(resultSet); - assertEquals("ID,NAME" + recordSeparator + "1,r1" + recordSeparator + "2,r2" + recordSeparator, + assertEquals(3, printer.getRecordCount()); + assertEquals("ID,NAME,TEXT" + RECORD_SEPARATOR + "1,r1,\"long text 1\"" + RECORD_SEPARATOR + "2,r2,\"" + longText2 + "\"" + RECORD_SEPARATOR, sw.toString()); } + assertRowCount(format, sw.toString(), TABLE_AND_HEADER_RECORD_COUNT); } } @Test - @Ignore - public void testJira135_part1() throws IOException { - final CSVFormat format = CSVFormat.DEFAULT.withRecordSeparator('\n').withQuote(DQUOTE_CHAR).withEscape(BACKSLASH_CH); + void testJira135_part1() throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.withRecordSeparator('\n').withQuote(DQUOTE_CHAR).withEscape(BACKSLASH); final StringWriter sw = new StringWriter(); final List list = new LinkedList<>(); - try (final CSVPrinter printer = new CSVPrinter(sw, format)) { + try (CSVPrinter printer = new CSVPrinter(sw, format)) { list.add("\""); printer.printRecord(list); } @@ -551,12 +980,12 @@ public void testJira135_part1() throws IOException { } @Test - @Ignore - public void testJira135_part2() throws IOException { - final CSVFormat format = CSVFormat.DEFAULT.withRecordSeparator('\n').withQuote(DQUOTE_CHAR).withEscape(BACKSLASH_CH); + @Disabled + void testJira135_part2() throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.withRecordSeparator('\n').withQuote(DQUOTE_CHAR).withEscape(BACKSLASH); final StringWriter sw = new StringWriter(); final List list = new LinkedList<>(); - try (final CSVPrinter printer = new CSVPrinter(sw, format)) { + try (CSVPrinter printer = new CSVPrinter(sw, format)) { list.add("\n"); printer.printRecord(list); } @@ -567,12 +996,11 @@ public void testJira135_part2() throws IOException { } @Test - @Ignore - public void testJira135_part3() throws IOException { - final CSVFormat format = 
CSVFormat.DEFAULT.withRecordSeparator('\n').withQuote(DQUOTE_CHAR).withEscape(BACKSLASH_CH); + void testJira135_part3() throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.withRecordSeparator('\n').withQuote(DQUOTE_CHAR).withEscape(BACKSLASH); final StringWriter sw = new StringWriter(); final List list = new LinkedList<>(); - try (final CSVPrinter printer = new CSVPrinter(sw, format)) { + try (CSVPrinter printer = new CSVPrinter(sw, format)) { list.add("\\"); printer.printRecord(list); } @@ -583,12 +1011,12 @@ public void testJira135_part3() throws IOException { } @Test - @Ignore - public void testJira135All() throws IOException { - final CSVFormat format = CSVFormat.DEFAULT.withRecordSeparator('\n').withQuote(DQUOTE_CHAR).withEscape(BACKSLASH_CH); + @Disabled + void testJira135All() throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.withRecordSeparator('\n').withQuote(DQUOTE_CHAR).withEscape(BACKSLASH); final StringWriter sw = new StringWriter(); final List list = new LinkedList<>(); - try (final CSVPrinter printer = new CSVPrinter(sw, format)) { + try (CSVPrinter printer = new CSVPrinter(sw, format)) { list.add("\""); list.add("\n"); list.add("\\"); @@ -601,33 +1029,101 @@ public void testJira135All() throws IOException { } @Test - public void testMultiLineComment() throws IOException { + void testMongoDbCsvBasic() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withCommentMarker('#'))) { - printer.printComment("This is a comment\non multiple lines"); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.MONGODB_CSV)) { + printer.printRecord("a", "b"); + assertEquals("a,b" + RECORD_SEPARATOR, sw.toString()); + assertEquals(1, printer.getRecordCount()); + } + } + + @Test + void testMongoDbCsvCommaInValue() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.MONGODB_CSV)) { + 
printer.printRecord("a,b", "c"); + assertEquals("\"a,b\",c" + RECORD_SEPARATOR, sw.toString()); + assertEquals(1, printer.getRecordCount()); + } + } + + @Test + void testMongoDbCsvDoubleQuoteInValue() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.MONGODB_CSV)) { + printer.printRecord("a \"c\" b", "d"); + assertEquals("\"a \"\"c\"\" b\",d" + RECORD_SEPARATOR, sw.toString()); + assertEquals(1, printer.getRecordCount()); + } + } - assertEquals("# This is a comment" + recordSeparator + "# on multiple lines" + recordSeparator, - sw.toString()); + @Test + void testMongoDbCsvTabInValue() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.MONGODB_CSV)) { + printer.printRecord("a\tb", "c"); + assertEquals("a\tb,c" + RECORD_SEPARATOR, sw.toString()); + assertEquals(1, printer.getRecordCount()); } } @Test - public void testMySqlNullOutput() throws IOException { + void testMongoDbTsvBasic() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.MONGODB_TSV)) { + printer.printRecord("a", "b"); + assertEquals("a\tb" + RECORD_SEPARATOR, sw.toString()); + assertEquals(1, printer.getRecordCount()); + } + } + + @Test + void testMongoDbTsvCommaInValue() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.MONGODB_TSV)) { + printer.printRecord("a,b", "c"); + assertEquals("a,b\tc" + RECORD_SEPARATOR, sw.toString()); + assertEquals(1, printer.getRecordCount()); + } + } + + @Test + void testMongoDbTsvTabInValue() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.MONGODB_TSV)) { + printer.printRecord("a\tb", "c"); + assertEquals("\"a\tb\"\tc" + RECORD_SEPARATOR, sw.toString()); + } + } + + @Test + void testMultiLineComment() 
throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withCommentMarker('#'))) { + printer.printComment("This is a comment\non multiple lines"); + assertEquals("# This is a comment" + RECORD_SEPARATOR + "# on multiple lines" + RECORD_SEPARATOR, sw.toString()); + assertEquals(0, printer.getRecordCount()); + } + } + + @Test + void testMySqlNullOutput() throws IOException { Object[] s = new String[] { "NULL", null }; CSVFormat format = CSVFormat.MYSQL.withQuote(DQUOTE_CHAR).withNullString("NULL").withQuoteMode(QuoteMode.NON_NUMERIC); StringWriter writer = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(writer, format)) { + try (CSVPrinter printer = new CSVPrinter(writer, format)) { printer.printRecord(s); } String expected = "\"NULL\"\tNULL\n"; assertEquals(expected, writer.toString()); String[] record0 = toFirstRecordValues(expected, format); - assertArrayEquals(new Object[2], record0); + assertArrayEquals(s, record0); s = new String[] { "\\N", null }; format = CSVFormat.MYSQL.withNullString("\\N"); writer = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(writer, format)) { + try (CSVPrinter printer = new CSVPrinter(writer, format)) { printer.printRecord(s); } expected = "\\\\N\t\\N\n"; @@ -638,7 +1134,7 @@ public void testMySqlNullOutput() throws IOException { s = new String[] { "\\N", "A" }; format = CSVFormat.MYSQL.withNullString("\\N"); writer = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(writer, format)) { + try (CSVPrinter printer = new CSVPrinter(writer, format)) { printer.printRecord(s); } expected = "\\\\N\tA\n"; @@ -649,7 +1145,7 @@ public void testMySqlNullOutput() throws IOException { s = new String[] { "\n", "A" }; format = CSVFormat.MYSQL.withNullString("\\N"); writer = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(writer, format)) { + try (CSVPrinter printer = new CSVPrinter(writer, 
format)) { printer.printRecord(s); } expected = "\\n\tA\n"; @@ -660,7 +1156,7 @@ public void testMySqlNullOutput() throws IOException { s = new String[] { "", null }; format = CSVFormat.MYSQL.withNullString("NULL"); writer = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(writer, format)) { + try (CSVPrinter printer = new CSVPrinter(writer, format)) { printer.printRecord(s); } expected = "\tNULL\n"; @@ -671,7 +1167,7 @@ public void testMySqlNullOutput() throws IOException { s = new String[] { "", null }; format = CSVFormat.MYSQL; writer = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(writer, format)) { + try (CSVPrinter printer = new CSVPrinter(writer, format)) { printer.printRecord(s); } expected = "\t\\N\n"; @@ -682,7 +1178,7 @@ public void testMySqlNullOutput() throws IOException { s = new String[] { "\\N", "", "\u000e,\\\r" }; format = CSVFormat.MYSQL; writer = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(writer, format)) { + try (CSVPrinter printer = new CSVPrinter(writer, format)) { printer.printRecord(s); } expected = "\\\\N\t\t\u000e,\\\\\\r\n"; @@ -693,7 +1189,7 @@ public void testMySqlNullOutput() throws IOException { s = new String[] { "NULL", "\\\r" }; format = CSVFormat.MYSQL; writer = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(writer, format)) { + try (CSVPrinter printer = new CSVPrinter(writer, format)) { printer.printRecord(s); } expected = "NULL\t\\\\\\r\n"; @@ -704,7 +1200,7 @@ public void testMySqlNullOutput() throws IOException { s = new String[] { "\\\r" }; format = CSVFormat.MYSQL; writer = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(writer, format)) { + try (CSVPrinter printer = new CSVPrinter(writer, format)) { printer.printRecord(s); } expected = "\\\\\\r\n"; @@ -714,47 +1210,53 @@ public void testMySqlNullOutput() throws IOException { } @Test - public void testMySqlNullStringDefault() { + void testMySqlNullStringDefault() 
{
         assertEquals("\\N", CSVFormat.MYSQL.getNullString());
     }
 
-    @Test(expected = IllegalArgumentException.class)
-    public void testNewCsvPrinterAppendableNullFormat() throws Exception {
-        try (final CSVPrinter printer = new CSVPrinter(new StringWriter(), null)) {
-            Assert.fail("This test should have thrown an exception.");
-        }
+    @Test
+    void testNewCsvPrinterAppendableNullFormat() {
+        assertThrows(NullPointerException.class, () -> new CSVPrinter(new StringWriter(), null));
     }
 
-    @Test(expected = IllegalArgumentException.class)
-    public void testNewCSVPrinterNullAppendableFormat() throws Exception {
-        try (final CSVPrinter printer = new CSVPrinter(null, CSVFormat.DEFAULT)) {
-            Assert.fail("This test should have thrown an exception.");
+    @Test
+    void testNewCsvPrinterNullAppendableFormat() {
+        assertThrows(NullPointerException.class, () -> new CSVPrinter(null, CSVFormat.DEFAULT));
+    }
+
+    @Test
+    void testNotFlushable() throws IOException {
+        final Appendable out = new StringBuilder();
+        try (CSVPrinter printer = new CSVPrinter(out, CSVFormat.DEFAULT)) {
+            printer.printRecord("a", "b", "c");
+            assertEquals("a,b,c" + RECORD_SEPARATOR, out.toString());
+            printer.flush();
         }
     }
 
     @Test
-    public void testParseCustomNullValues() throws IOException {
+    void testParseCustomNullValues() throws IOException {
         final StringWriter sw = new StringWriter();
         final CSVFormat format = CSVFormat.DEFAULT.withNullString("NULL");
-        try (final CSVPrinter printer = new CSVPrinter(sw, format)) {
+        try (CSVPrinter printer = new CSVPrinter(sw, format)) {
             printer.printRecord("a", null, "b");
         }
         final String csvString = sw.toString();
-        assertEquals("a,NULL,b" + recordSeparator, csvString);
-        try (final CSVParser iterable = format.parse(new StringReader(csvString))) {
+        assertEquals("a,NULL,b" + RECORD_SEPARATOR, csvString);
+        try (CSVParser iterable = format.parse(new StringReader(csvString))) {
             final Iterator<CSVRecord> iterator = iterable.iterator();
             final CSVRecord record = iterator.next();
             assertEquals("a",
record.get(0)); - assertEquals(null, record.get(1)); + assertNull(record.get(1)); assertEquals("b", record.get(2)); assertFalse(iterator.hasNext()); } } @Test - public void testPlainEscaped() throws IOException { + void testPlainEscaped() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(null).withEscape('!'))) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(null).withEscape('!'))) { printer.print("abc"); printer.print("xyz"); assertEquals("abc,xyz", sw.toString()); @@ -762,9 +1264,9 @@ public void testPlainEscaped() throws IOException { } @Test - public void testPlainPlain() throws IOException { + void testPlainPlain() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(null))) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(null))) { printer.print("abc"); printer.print("xyz"); assertEquals("abc,xyz", sw.toString()); @@ -772,205 +1274,711 @@ public void testPlainPlain() throws IOException { } @Test - public void testPlainQuoted() throws IOException { + void testPlainQuoted() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote('\''))) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote('\''))) { printer.print("abc"); assertEquals("abc", sw.toString()); } } @Test - public void testPrint() throws IOException { + @Disabled + void testPostgreSqlCsvNullOutput() throws IOException { + Object[] s = new String[] { "NULL", null }; + CSVFormat format = CSVFormat.POSTGRESQL_CSV.withQuote(DQUOTE_CHAR).withNullString("NULL").withQuoteMode(QuoteMode.ALL_NON_NULL); + StringWriter writer = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(writer, format)) { + printer.printRecord(s); + } + String expected = 
"\"NULL\",NULL\n"; + assertEquals(expected, writer.toString()); + String[] record0 = toFirstRecordValues(expected, format); + assertArrayEquals(new Object[2], record0); + + s = new String[] { "\\N", null }; + format = CSVFormat.POSTGRESQL_CSV.withNullString("\\N"); + writer = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(writer, format)) { + printer.printRecord(s); + } + expected = "\\\\N\t\\N\n"; + assertEquals(expected, writer.toString()); + record0 = toFirstRecordValues(expected, format); + assertArrayEquals(expectNulls(s, format), record0); + + s = new String[] { "\\N", "A" }; + format = CSVFormat.POSTGRESQL_CSV.withNullString("\\N"); + writer = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(writer, format)) { + printer.printRecord(s); + } + expected = "\\\\N\tA\n"; + assertEquals(expected, writer.toString()); + record0 = toFirstRecordValues(expected, format); + assertArrayEquals(expectNulls(s, format), record0); + + s = new String[] { "\n", "A" }; + format = CSVFormat.POSTGRESQL_CSV.withNullString("\\N"); + writer = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(writer, format)) { + printer.printRecord(s); + } + expected = "\\n\tA\n"; + assertEquals(expected, writer.toString()); + record0 = toFirstRecordValues(expected, format); + assertArrayEquals(expectNulls(s, format), record0); + + s = new String[] { "", null }; + format = CSVFormat.POSTGRESQL_CSV.withNullString("NULL"); + writer = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(writer, format)) { + printer.printRecord(s); + } + expected = "\tNULL\n"; + assertEquals(expected, writer.toString()); + record0 = toFirstRecordValues(expected, format); + assertArrayEquals(expectNulls(s, format), record0); + + s = new String[] { "", null }; + format = CSVFormat.POSTGRESQL_CSV; + writer = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(writer, format)) { + printer.printRecord(s); + } + expected = "\t\\N\n"; + assertEquals(expected, 
writer.toString()); + record0 = toFirstRecordValues(expected, format); + assertArrayEquals(expectNulls(s, format), record0); + + s = new String[] { "\\N", "", "\u000e,\\\r" }; + format = CSVFormat.POSTGRESQL_CSV; + writer = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(writer, format)) { + printer.printRecord(s); + } + expected = "\\\\N\t\t\u000e,\\\\\\r\n"; + assertEquals(expected, writer.toString()); + record0 = toFirstRecordValues(expected, format); + assertArrayEquals(expectNulls(s, format), record0); + + s = new String[] { "NULL", "\\\r" }; + format = CSVFormat.POSTGRESQL_CSV; + writer = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(writer, format)) { + printer.printRecord(s); + } + expected = "NULL\t\\\\\\r\n"; + assertEquals(expected, writer.toString()); + record0 = toFirstRecordValues(expected, format); + assertArrayEquals(expectNulls(s, format), record0); + + s = new String[] { "\\\r" }; + format = CSVFormat.POSTGRESQL_CSV; + writer = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(writer, format)) { + printer.printRecord(s); + } + expected = "\\\\\\r\n"; + assertEquals(expected, writer.toString()); + record0 = toFirstRecordValues(expected, format); + assertArrayEquals(expectNulls(s, format), record0); + } + + @Test + @Disabled + void testPostgreSqlCsvTextOutput() throws IOException { + Object[] s = new String[] { "NULL", null }; + CSVFormat format = CSVFormat.POSTGRESQL_TEXT.withQuote(DQUOTE_CHAR).withNullString("NULL").withQuoteMode(QuoteMode.ALL_NON_NULL); + StringWriter writer = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(writer, format)) { + printer.printRecord(s); + } + String expected = "\"NULL\"\tNULL\n"; + assertEquals(expected, writer.toString()); + String[] record0 = toFirstRecordValues(expected, format); + assertArrayEquals(new Object[2], record0); + + s = new String[] { "\\N", null }; + format = CSVFormat.POSTGRESQL_TEXT.withNullString("\\N"); + writer = new StringWriter(); + 
try (CSVPrinter printer = new CSVPrinter(writer, format)) { + printer.printRecord(s); + } + expected = "\\\\N\t\\N\n"; + assertEquals(expected, writer.toString()); + record0 = toFirstRecordValues(expected, format); + assertArrayEquals(expectNulls(s, format), record0); + + s = new String[] { "\\N", "A" }; + format = CSVFormat.POSTGRESQL_TEXT.withNullString("\\N"); + writer = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(writer, format)) { + printer.printRecord(s); + } + expected = "\\\\N\tA\n"; + assertEquals(expected, writer.toString()); + record0 = toFirstRecordValues(expected, format); + assertArrayEquals(expectNulls(s, format), record0); + + s = new String[] { "\n", "A" }; + format = CSVFormat.POSTGRESQL_TEXT.withNullString("\\N"); + writer = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(writer, format)) { + printer.printRecord(s); + } + expected = "\\n\tA\n"; + assertEquals(expected, writer.toString()); + record0 = toFirstRecordValues(expected, format); + assertArrayEquals(expectNulls(s, format), record0); + + s = new String[] { "", null }; + format = CSVFormat.POSTGRESQL_TEXT.withNullString("NULL"); + writer = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(writer, format)) { + printer.printRecord(s); + } + expected = "\tNULL\n"; + assertEquals(expected, writer.toString()); + record0 = toFirstRecordValues(expected, format); + assertArrayEquals(expectNulls(s, format), record0); + + s = new String[] { "", null }; + format = CSVFormat.POSTGRESQL_TEXT; + writer = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(writer, format)) { + printer.printRecord(s); + } + expected = "\t\\N\n"; + assertEquals(expected, writer.toString()); + record0 = toFirstRecordValues(expected, format); + assertArrayEquals(expectNulls(s, format), record0); + + s = new String[] { "\\N", "", "\u000e,\\\r" }; + format = CSVFormat.POSTGRESQL_TEXT; + writer = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(writer, 
format)) { + printer.printRecord(s); + } + expected = "\\\\N\t\t\u000e,\\\\\\r\n"; + assertEquals(expected, writer.toString()); + record0 = toFirstRecordValues(expected, format); + assertArrayEquals(expectNulls(s, format), record0); + + s = new String[] { "NULL", "\\\r" }; + format = CSVFormat.POSTGRESQL_TEXT; + writer = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(writer, format)) { + printer.printRecord(s); + } + expected = "NULL\t\\\\\\r\n"; + assertEquals(expected, writer.toString()); + record0 = toFirstRecordValues(expected, format); + assertArrayEquals(expectNulls(s, format), record0); + + s = new String[] { "\\\r" }; + format = CSVFormat.POSTGRESQL_TEXT; + writer = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(writer, format)) { + printer.printRecord(s); + } + expected = "\\\\\\r\n"; + assertEquals(expected, writer.toString()); + record0 = toFirstRecordValues(expected, format); + assertArrayEquals(expectNulls(s, format), record0); + } + + @Test + void testPostgreSqlNullStringDefaultCsv() { + assertEquals("", CSVFormat.POSTGRESQL_CSV.getNullString()); + } + + @Test + void testPostgreSqlNullStringDefaultText() { + assertEquals("\\N", CSVFormat.POSTGRESQL_TEXT.getNullString()); + } + + @Test + void testPrint() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = CSVFormat.DEFAULT.print(sw)) { + try (CSVPrinter printer = CSVFormat.DEFAULT.print(sw)) { + assertInitialState(printer); printer.printRecord("a", "b\\c"); - assertEquals("a,b\\c" + recordSeparator, sw.toString()); + assertEquals("a,b\\c" + RECORD_SEPARATOR, sw.toString()); } } @Test - public void testPrintCustomNullValues() throws IOException { + void testPrintCSVParser() throws IOException { + // @formatter:off + final String code = "a1,b1\n" + // 1) + "a2,b2\n" + // 2) + "a3,b3\n" + // 3) + "a4,b4\n"; // 4) + // @formatter:on + final String[][] res = { { "a1", "b1" }, { "a2", "b2" }, { "a3", "b3" }, { "a4", "b4" } }; + 
final CSVFormat format = CSVFormat.DEFAULT; final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withNullString("NULL"))) { - printer.printRecord("a", null, "b"); - assertEquals("a,NULL,b" + recordSeparator, sw.toString()); + try (CSVPrinter printer = format.print(sw); + CSVParser parser = CSVParser.parse(code, format)) { + assertInitialState(printer); + printer.printRecords(parser); + } + try (CSVParser parser = CSVParser.parse(sw.toString(), format)) { + final List records = parser.getRecords(); + assertFalse(records.isEmpty()); + Utils.compare("Fail", res, records, -1); } } @Test - public void testPrinter1() throws IOException { + void testPrintCSVRecord() throws IOException { + // @formatter:off + final String code = "a1,b1\n" + // 1) + "a2,b2\n" + // 2) + "a3,b3\n" + // 3) + "a4,b4\n"; // 4) + // @formatter:on + final String[][] res = { { "a1", "b1" }, { "a2", "b2" }, { "a3", "b3" }, { "a4", "b4" } }; + final CSVFormat format = CSVFormat.DEFAULT; final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { - printer.printRecord("a", "b"); - assertEquals("a,b" + recordSeparator, sw.toString()); + int row = 0; + try (CSVPrinter printer = format.print(sw); + CSVParser parser = CSVParser.parse(code, format)) { + assertInitialState(printer); + for (final CSVRecord record : parser) { + printer.printRecord(record); + assertEquals(++row, printer.getRecordCount()); + } + assertEquals(row, printer.getRecordCount()); + } + try (CSVParser parser = CSVParser.parse(sw.toString(), format)) { + final List records = parser.getRecords(); + assertFalse(records.isEmpty()); + Utils.compare("Fail", res, records, -1); + } + } + + @ParameterizedTest + @ValueSource(longs = { -1, 0, 3, 4, Long.MAX_VALUE }) + void testPrintCSVRecords(final long maxRows) throws IOException { + // @formatter:off + final String code = "a1,b1\n" + // 1) + "a2,b2\n" + // 2) + "a3,b3\n" + // 
3) + "a4,b4\n"; // 4) + // @formatter:on + final String[][] expected = { { "a1", "b1" }, { "a2", "b2" }, { "a3", "b3" }, { "a4", "b4" } }; + final CSVFormat format = CSVFormat.DEFAULT.builder().setMaxRows(maxRows).get(); + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = format.print(sw); + CSVParser parser = CSVParser.parse(code, format)) { + assertInitialState(printer); + printer.printRecords(parser.getRecords()); + } + try (CSVParser parser = CSVParser.parse(sw.toString(), format)) { + final List records = parser.getRecords(); + assertFalse(records.isEmpty()); + Utils.compare("Fail", expected, records, maxRows); } } @Test - public void testRfc4180QuoteSingleChar() throws IOException { + void testPrintCustomNullValues() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.RFC4180)) { - printer.printRecord(EURO_CH, "Deux"); - assertEquals("\"" + EURO_CH + "\",Deux" + recordSeparator, sw.toString()); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withNullString("NULL"))) { + assertInitialState(printer); + printer.printRecord("a", null, "b"); + assertEquals("a,NULL,b" + RECORD_SEPARATOR, sw.toString()); } } @Test - public void testPrinter2() throws IOException { + void testPrinter1() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { + assertInitialState(printer); + printer.printRecord("a", "b"); + assertEquals(1, printer.getRecordCount()); + assertEquals("a,b" + RECORD_SEPARATOR, sw.toString()); + } + } + + @Test + void testPrinter2() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { + assertInitialState(printer); printer.printRecord("a,b", "b"); - assertEquals("\"a,b\",b" + recordSeparator, sw.toString()); + 
assertEquals("\"a,b\",b" + RECORD_SEPARATOR, sw.toString()); } } @Test - public void testPrinter3() throws IOException { + void testPrinter3() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { + assertInitialState(printer); printer.printRecord("a, b", "b "); - assertEquals("\"a, b\",\"b \"" + recordSeparator, sw.toString()); + assertEquals("\"a, b\",\"b \"" + RECORD_SEPARATOR, sw.toString()); } } @Test - public void testPrinter4() throws IOException { + void testPrinter4() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { + assertInitialState(printer); printer.printRecord("a", "b\"c"); - assertEquals("a,\"b\"\"c\"" + recordSeparator, sw.toString()); + assertEquals("a,\"b\"\"c\"" + RECORD_SEPARATOR, sw.toString()); } } @Test - public void testPrinter5() throws IOException { + void testPrinter5() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { + assertInitialState(printer); printer.printRecord("a", "b\nc"); - assertEquals("a,\"b\nc\"" + recordSeparator, sw.toString()); + assertEquals("a,\"b\nc\"" + RECORD_SEPARATOR, sw.toString()); } } @Test - public void testPrinter6() throws IOException { + void testPrinter6() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { + assertInitialState(printer); printer.printRecord("a", "b\r\nc"); - assertEquals("a,\"b\r\nc\"" + recordSeparator, sw.toString()); + assertEquals("a,\"b\r\nc\"" + RECORD_SEPARATOR, 
sw.toString()); } } @Test - public void testPrinter7() throws IOException { + void testPrinter7() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { + assertInitialState(printer); printer.printRecord("a", "b\\c"); - assertEquals("a,b\\c" + recordSeparator, sw.toString()); + assertEquals("a,b\\c" + RECORD_SEPARATOR, sw.toString()); } } @Test - public void testPrintNullValues() throws IOException { + void testPrintNullValues() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT)) { + assertInitialState(printer); printer.printRecord("a", null, "b"); - assertEquals("a,,b" + recordSeparator, sw.toString()); + assertEquals("a,,b" + RECORD_SEPARATOR, sw.toString()); } } @Test - public void testPrintOnePositiveInteger() throws IOException { + void testPrintOnePositiveInteger() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuoteMode(QuoteMode.MINIMAL))) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuoteMode(QuoteMode.MINIMAL))) { + assertInitialState(printer); printer.print(Integer.MAX_VALUE); assertEquals(String.valueOf(Integer.MAX_VALUE), sw.toString()); } } + /** + * Test to target the use of {@link IOUtils#copy(java.io.Reader, Appendable)} which directly buffers the value from the Reader to the Appendable. + * + *

<p>
+ * Requires the format to have no quote or escape character, value to be a {@link Reader Reader} and the output MUST NOT be a {@link Writer Writer} + * but some other Appendable. + *
</p>
+ * + * @throws IOException Not expected to happen + */ + @Test + void testPrintReaderWithoutQuoteToAppendable() throws IOException { + final StringBuilder sb = new StringBuilder(); + final String content = "testValue"; + try (CSVPrinter printer = new CSVPrinter(sb, CSVFormat.DEFAULT.withQuote(null))) { + assertInitialState(printer); + final StringReader value = new StringReader(content); + printer.print(value); + } + assertEquals(content, sb.toString()); + } + + /** + * Test to target the use of {@link IOUtils#copyLarge(java.io.Reader, Writer)} which directly buffers the value from the Reader to the Writer. + * + *

<p>
+ * Requires the format to have no quote or escape character, value to be a {@link Reader Reader} and the output MUST be a {@link Writer Writer}. + *
</p>
+ * + * @throws IOException Not expected to happen + */ + @Test + void testPrintReaderWithoutQuoteToWriter() throws IOException { + final StringWriter sw = new StringWriter(); + final String content = "testValue"; + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(null))) { + final StringReader value = new StringReader(content); + printer.print(value); + } + assertEquals(content, sw.toString()); + } + @Test - public void testPrintToFileWithCharsetUtf16Be() throws IOException { - File file = File.createTempFile(getClass().getName(), ".csv"); - try (final CSVPrinter printer = CSVFormat.DEFAULT.print(file, StandardCharsets.UTF_16BE)) { + void testPrintRecordStream() throws IOException { + // @formatter:off + final String code = "a1,b1\n" + // 1) + "a2,b2\n" + // 2) + "a3,b3\n" + // 3) + "a4,b4\n"; // 4) + // @formatter:on + final String[][] res = { { "a1", "b1" }, { "a2", "b2" }, { "a3", "b3" }, { "a4", "b4" } }; + final CSVFormat format = CSVFormat.DEFAULT; + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = format.print(sw); + CSVParser parser = CSVParser.parse(code, format)) { + long count = 0; + for (final CSVRecord record : parser) { + printer.printRecord(record.stream()); + assertEquals(++count, printer.getRecordCount()); + } + } + try (CSVParser parser = CSVParser.parse(sw.toString(), format)) { + final List records = parser.getRecords(); + assertFalse(records.isEmpty()); + Utils.compare("Fail", res, records, -1); + } + } + + @Test + void testPrintRecordsWithCSVRecord() throws IOException { + final String[] values = { "A", "B", "C" }; + final String rowData = StringUtils.join(values, ','); + final CharArrayWriter charArrayWriter = new CharArrayWriter(0); + try (CSVParser parser = CSVFormat.DEFAULT.parse(new StringReader(rowData)); + CSVPrinter printer = CSVFormat.INFORMIX_UNLOAD.print(charArrayWriter)) { + long count = 0; + for (final CSVRecord record : parser) { + printer.printRecord(record); + 
assertEquals(++count, printer.getRecordCount()); + } + } + assertEquals(6, charArrayWriter.size()); + assertEquals("A|B|C" + CSVFormat.INFORMIX_UNLOAD.getRecordSeparator(), charArrayWriter.toString()); + } + + @Test + void testPrintRecordsWithEmptyVector() throws IOException { + final PrintStream out = System.out; + try { + System.setOut(new PrintStream(NullOutputStream.INSTANCE)); + try (CSVPrinter printer = CSVFormat.POSTGRESQL_TEXT.printer()) { + final Vector vector = new Vector<>(); + final int expectedCapacity = 23; + vector.setSize(expectedCapacity); + printer.printRecords(vector); + assertEquals(expectedCapacity, vector.capacity()); + assertEquals(expectedCapacity, printer.getRecordCount()); + } + } finally { + System.setOut(out); + } + } + + @Test + void testPrintRecordsWithObjectArray() throws IOException { + final CharArrayWriter charArrayWriter = new CharArrayWriter(0); + final Object[] objectArray = new Object[6]; + try (CSVPrinter printer = CSVFormat.INFORMIX_UNLOAD.print(charArrayWriter)) { + final HashSet hashSet = new HashSet<>(); + objectArray[3] = hashSet; + printer.printRecords(objectArray); + assertEquals(objectArray.length, printer.getRecordCount()); + } + assertEquals(6, charArrayWriter.size()); + assertEquals("\n\n\n\n\n\n", charArrayWriter.toString()); + } + + @Test + void testPrintRecordsWithResultSetOneRow() throws IOException, SQLException { + try (CSVPrinter printer = CSVFormat.MYSQL.printer()) { + try (ResultSet resultSet = new SimpleResultSet()) { + assertInitialState(printer); + printer.printRecords(resultSet); + assertInitialState(printer); + assertEquals(0, resultSet.getRow()); + } + } + } + + @Test + void testPrintToFileWithCharsetUtf16Be() throws IOException { + final File file = createTempFile(); + try (CSVPrinter printer = CSVFormat.DEFAULT.print(file, StandardCharsets.UTF_16BE)) { printer.printRecord("a", "b\\c"); } - assertEquals("a,b\\c" + recordSeparator, FileUtils.readFileToString(file, StandardCharsets.UTF_16BE)); + 
assertEquals("a,b\\c" + RECORD_SEPARATOR, FileUtils.readFileToString(file, StandardCharsets.UTF_16BE)); } @Test - public void testPrintToFileWithDefaultCharset() throws IOException { - File file = File.createTempFile(getClass().getName(), ".csv"); - try (final CSVPrinter printer = CSVFormat.DEFAULT.print(file, Charset.defaultCharset())) { + void testPrintToFileWithDefaultCharset() throws IOException { + final File file = createTempFile(); + try (CSVPrinter printer = CSVFormat.DEFAULT.print(file, Charset.defaultCharset())) { printer.printRecord("a", "b\\c"); } - assertEquals("a,b\\c" + recordSeparator, FileUtils.readFileToString(file, Charset.defaultCharset())); + assertEquals("a,b\\c" + RECORD_SEPARATOR, FileUtils.readFileToString(file, Charset.defaultCharset())); } @Test - public void testPrintToPathWithDefaultCharset() throws IOException { - File file = File.createTempFile(getClass().getName(), ".csv"); - try (final CSVPrinter printer = CSVFormat.DEFAULT.print(file.toPath(), Charset.defaultCharset())) { + void testPrintToPathWithDefaultCharset() throws IOException { + final Path file = createTempPath(); + try (CSVPrinter printer = CSVFormat.DEFAULT.print(file, Charset.defaultCharset())) { printer.printRecord("a", "b\\c"); } - assertEquals("a,b\\c" + recordSeparator, FileUtils.readFileToString(file, Charset.defaultCharset())); + assertEquals("a,b\\c" + RECORD_SEPARATOR, new String(Files.readAllBytes(file), Charset.defaultCharset())); } @Test - public void testQuoteAll() throws IOException { + void testQuoteAll() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuoteMode(QuoteMode.ALL))) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuoteMode(QuoteMode.ALL))) { printer.printRecord("a", "b\nc", "d"); - assertEquals("\"a\",\"b\nc\",\"d\"" + recordSeparator, sw.toString()); + assertEquals("\"a\",\"b\nc\",\"d\"" + RECORD_SEPARATOR, sw.toString()); + } + } 
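Reviewer note: the printer tests in this file (testPrinter2 through testPrinter7, testQuoteAll) pin down when a value is wrapped in quotes. The following is a minimal, library-free sketch of that behavior for readers who want to reason about the expected strings. It is not the Commons CSV implementation: the class `CsvQuoteSketch` and its methods are hypothetical names, and the "minimal" rule is deliberately simplified (the real printer also considers the escape character and the comment marker, among other cases).

```java
import java.util.StringJoiner;

/** Hypothetical helper, not part of Commons CSV: a simplified model of the quoting seen in the tests. */
final class CsvQuoteSketch {

    private static final char QUOTE = '"';
    private static final char DELIMITER = ',';

    /** Wraps the value in quotes and doubles every embedded quote character. */
    static String quoteAlways(final String value) {
        final StringBuilder sb = new StringBuilder(value.length() + 2);
        sb.append(QUOTE);
        for (int i = 0; i < value.length(); i++) {
            final char c = value.charAt(i);
            if (c == QUOTE) {
                sb.append(QUOTE); // an embedded quote is escaped by doubling it
            }
            sb.append(c);
        }
        return sb.append(QUOTE).toString();
    }

    /**
     * Simplified "minimal" rule (an assumption of this sketch): quote only when the value
     * contains the delimiter, a quote, CR or LF, or starts or ends with whitespace.
     */
    static String quoteMinimal(final String value) {
        if (value.isEmpty()) {
            return value;
        }
        boolean needsQuotes = Character.isWhitespace(value.charAt(0))
                || Character.isWhitespace(value.charAt(value.length() - 1));
        for (int i = 0; !needsQuotes && i < value.length(); i++) {
            final char c = value.charAt(i);
            needsQuotes = c == DELIMITER || c == QUOTE || c == '\r' || c == '\n';
        }
        return needsQuotes ? quoteAlways(value) : value;
    }

    /** Joins the values with the delimiter; no record separator is appended. */
    static String joinRecord(final boolean quoteAll, final String... values) {
        final StringJoiner joiner = new StringJoiner(String.valueOf(DELIMITER));
        for (final String value : values) {
            joiner.add(quoteAll ? quoteAlways(value) : quoteMinimal(value));
        }
        return joiner.toString();
    }

    public static void main(final String[] args) {
        System.out.println(joinRecord(false, "a,b", "b"));
        System.out.println(joinRecord(false, "a", "b\"c"));
        System.out.println(joinRecord(true, "a", "b", "d"));
    }
}
```

Under these assumptions, `joinRecord(false, ...)` mirrors the strings asserted for `CSVFormat.DEFAULT`, and `joinRecord(true, ...)` mirrors `QuoteMode.ALL`, in both cases without the trailing record separator.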
+ + @Test + void testQuoteCharEscapedWithQuoteModeNone() throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.builder().setQuote('"').setEscape('?').setQuoteMode(QuoteMode.NONE).get(); + final StringWriter sw = new StringWriter(); + final String col1 = "\"abc"; + final String col2 = "x\"y"; + try (CSVPrinter printer = new CSVPrinter(sw, format)) { + printer.printRecord(col1, col2); + printer.printRecord(new StringReader(col1), new StringReader(col2)); + } + assertEquals("?\"abc,x?\"y" + RECORD_SEPARATOR + "?\"abc,x?\"y" + RECORD_SEPARATOR, sw.toString()); + // The emitted records must read back as the original values. + try (CSVParser parser = CSVParser.parse(sw.toString(), format)) { + final List records = parser.getRecords(); + assertEquals(2, records.size()); + for (final CSVRecord record : records) { + assertEquals(col1, record.get(0)); + assertEquals(col2, record.get(1)); + } + } + } + + @Test + void testQuoteCommaFirstChar() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.RFC4180)) { + printer.printRecord(","); + assertEquals("\",\"" + RECORD_SEPARATOR, sw.toString()); } } @Test - public void testQuoteNonNumeric() throws IOException { + void testQuoteCommentMarkerFirstChar() throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.builder().setCommentMarker(';').get(); final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuoteMode(QuoteMode.NON_NUMERIC))) { + final String col1 = ";comment-like"; + try (CSVPrinter printer = new CSVPrinter(sw, format)) { + // A real comment is written with the marker, unquoted. + printer.printComment("a real comment"); + // A value starting with the marker is quoted, so it does not read back as a comment. + printer.printRecord(col1, "b"); + // The marker past the first character does not start a comment, so only the leading-marker value is quoted. 
+ printer.printRecord("a;b", ";c"); + } + final String string = sw.toString(); + assertEquals("; a real comment" + RECORD_SEPARATOR + + "\";comment-like\",b" + RECORD_SEPARATOR + + "a;b,\";c\"" + RECORD_SEPARATOR, string); + // The comment is dropped on read; both data records survive intact. + try (CSVParser parser = CSVParser.parse(string, format)) { + final List records = parser.getRecords(); + assertEquals(2, records.size()); + assertEquals(col1, records.get(0).get(0)); + assertEquals("b", records.get(0).get(1)); + assertEquals("a;b", records.get(1).get(0)); + assertEquals(";c", records.get(1).get(1)); + } + } + + @Test + void testQuoteNonNumeric() throws IOException { + final StringWriter sw = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuoteMode(QuoteMode.NON_NUMERIC))) { printer.printRecord("a", "b\nc", Integer.valueOf(1)); - assertEquals("\"a\",\"b\nc\",1" + recordSeparator, sw.toString()); + assertEquals("\"a\",\"b\nc\",1" + RECORD_SEPARATOR, sw.toString()); } } @Test - public void testRandomDefault() throws Exception { + void testRandomDefault() throws Exception { doRandom(CSVFormat.DEFAULT, ITERATIONS_FOR_RANDOM_TEST); } @Test - public void testRandomExcel() throws Exception { + void testRandomExcel() throws Exception { doRandom(CSVFormat.EXCEL, ITERATIONS_FOR_RANDOM_TEST); } @Test - public void testRandomMySql() throws Exception { + @Disabled + void testRandomMongoDbCsv() throws Exception { + doRandom(CSVFormat.MONGODB_CSV, ITERATIONS_FOR_RANDOM_TEST); + } + + @Test + void testRandomMySql() throws Exception { doRandom(CSVFormat.MYSQL, ITERATIONS_FOR_RANDOM_TEST); } @Test - public void testRandomRfc4180() throws Exception { + @Disabled + void testRandomOracle() throws Exception { + doRandom(CSVFormat.ORACLE, ITERATIONS_FOR_RANDOM_TEST); + } + + @Test + @Disabled + void testRandomPostgreSqlCsv() throws Exception { + doRandom(CSVFormat.POSTGRESQL_CSV, ITERATIONS_FOR_RANDOM_TEST); + } + + @Test + void 
testRandomPostgreSqlText() throws Exception { + doRandom(CSVFormat.POSTGRESQL_TEXT, ITERATIONS_FOR_RANDOM_TEST); + } + + @Test + void testRandomRfc4180() throws Exception { doRandom(CSVFormat.RFC4180, ITERATIONS_FOR_RANDOM_TEST); } @Test - public void testRandomTdf() throws Exception { + void testRandomTdf() throws Exception { doRandom(CSVFormat.TDF, ITERATIONS_FOR_RANDOM_TEST); } @Test - public void testSingleLineComment() throws IOException { + void testSingleLineComment() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withCommentMarker('#'))) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withCommentMarker('#'))) { printer.printComment("This is a comment"); - assertEquals("# This is a comment" + recordSeparator, sw.toString()); + assertEquals("# This is a comment" + RECORD_SEPARATOR, sw.toString()); + assertEquals(0, printer.getRecordCount()); } } @Test - public void testSingleQuoteQuoted() throws IOException { + void testSingleQuoteQuoted() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote('\''))) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote('\''))) { printer.print("a'b'c"); printer.print("xyz"); assertEquals("'a''b''c',xyz", sw.toString()); @@ -978,11 +1986,10 @@ public void testSingleQuoteQuoted() throws IOException { } @Test - public void testSkipHeaderRecordFalse() throws IOException { + void testSkipHeaderRecordFalse() throws IOException { // functionally identical to testHeader, used to test CSV-153 final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, - CSVFormat.DEFAULT.withQuote(null).withHeader("C1", "C2", "C3").withSkipHeaderRecord(false))) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(null).withHeader("C1", "C2", 
"C3").withSkipHeaderRecord(false))) { printer.printRecord("a", "b", "c"); printer.printRecord("x", "y", "z"); assertEquals("C1,C2,C3\r\na,b,c\r\nx,y,z\r\n", sw.toString()); @@ -990,11 +1997,10 @@ public void testSkipHeaderRecordFalse() throws IOException { } @Test - public void testSkipHeaderRecordTrue() throws IOException { + void testSkipHeaderRecordTrue() throws IOException { // functionally identical to testHeaderNotSet, used to test CSV-153 final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, - CSVFormat.DEFAULT.withQuote(null).withHeader("C1", "C2", "C3").withSkipHeaderRecord(true))) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withQuote(null).withHeader("C1", "C2", "C3").withSkipHeaderRecord(true))) { printer.printRecord("a", "b", "c"); printer.printRecord("x", "y", "z"); assertEquals("a,b,c\r\nx,y,z\r\n", sw.toString()); @@ -1002,36 +2008,36 @@ public void testSkipHeaderRecordTrue() throws IOException { } @Test - public void testTrailingDelimiterOnTwoColumns() throws IOException { + void testTrailingDelimiterOnTwoColumns() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withTrailingDelimiter())) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withTrailingDelimiter())) { printer.printRecord("A", "B"); assertEquals("A,B,\r\n", sw.toString()); } } @Test - public void testTrimOffOneColumn() throws IOException { + void testTrimOffOneColumn() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withTrim(false))) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withTrim(false))) { printer.print(" A "); assertEquals("\" A \"", sw.toString()); } } @Test - public void testTrimOnOneColumn() throws IOException { + void testTrimOnOneColumn() throws IOException { final StringWriter sw = new 
StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withTrim())) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withTrim())) { printer.print(" A "); assertEquals("A", sw.toString()); } } @Test - public void testTrimOnTwoColumns() throws IOException { + void testTrimOnTwoColumns() throws IOException { final StringWriter sw = new StringWriter(); - try (final CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withTrim())) { + try (CSVPrinter printer = new CSVPrinter(sw, CSVFormat.DEFAULT.withTrim())) { printer.print(" A "); printer.print(" B "); assertEquals("A,B", sw.toString()); @@ -1039,6 +2045,18 @@ public void testTrimOnTwoColumns() throws IOException { } private String[] toFirstRecordValues(final String expected, final CSVFormat format) throws IOException { - return CSVParser.parse(expected, format).getRecords().get(0).values(); + try (CSVParser parser = CSVParser.parse(expected, format)) { + return parser.getRecords().get(0).values(); + } } + + private void tryFormat(final List list, final Character quote, final Character escape, final String expected) throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.withQuote(quote).withEscape(escape).withRecordSeparator(null); + final Appendable out = new StringBuilder(); + try (CSVPrinter printer = new CSVPrinter(out, format)) { + printer.printRecord(list); + } + assertEquals(expected, out.toString()); + } + } diff --git a/src/test/java/org/apache/commons/csv/CSVRecordTest.java b/src/test/java/org/apache/commons/csv/CSVRecordTest.java index 6347cc51a4..94060d62b2 100644 --- a/src/test/java/org/apache/commons/csv/CSVRecordTest.java +++ b/src/test/java/org/apache/commons/csv/CSVRecordTest.java @@ -1,128 +1,235 @@ /* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. 
- * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 + * https://www.apache.org/licenses/LICENSE-2.0 * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
*/ package org.apache.commons.csv; -import static org.junit.Assert.assertEquals; -import static org.junit.Assert.assertFalse; -import static org.junit.Assert.assertNotNull; -import static org.junit.Assert.assertNull; -import static org.junit.Assert.assertTrue; +import static org.junit.jupiter.api.Assertions.assertAll; +import static org.junit.jupiter.api.Assertions.assertArrayEquals; +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertInstanceOf; +import static org.junit.jupiter.api.Assertions.assertNotNull; +import static org.junit.jupiter.api.Assertions.assertNull; +import static org.junit.jupiter.api.Assertions.assertThrows; +import static org.junit.jupiter.api.Assertions.assertTrue; +import java.io.ByteArrayInputStream; +import java.io.ByteArrayOutputStream; import java.io.IOException; +import java.io.ObjectInputStream; +import java.io.ObjectOutputStream; +import java.io.StringReader; import java.util.ArrayList; -import java.util.Collections; -import java.util.HashMap; +import java.util.List; import java.util.Map; import java.util.TreeMap; import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.atomic.AtomicInteger; -import org.junit.Assert; -import org.junit.Before; -import org.junit.Test; +import org.apache.commons.lang3.StringUtils; +import org.junit.jupiter.api.BeforeEach; +import org.junit.jupiter.api.Test; -public class CSVRecordTest { +class CSVRecordTest { - private enum EnumFixture { UNKNOWN_COLUMN } + private enum EnumFixture { + UNKNOWN_COLUMN + } + + /** This enum overrides toString() but it's the names that matter. 
*/ + public enum EnumHeader { + FIRST("first"), SECOND("second"), THIRD("third"); + + private final String number; + + EnumHeader(final String number) { + this.number = number; + } + @Override + public String toString() { + return number; + } + } + + private Map headerMap; + private CSVRecord record; + private CSVRecord recordWithHeader; private String[] values; - private CSVRecord record, recordWithHeader; - private Map header; - @Before + @BeforeEach public void setUp() throws Exception { values = new String[] { "A", "B", "C" }; - record = new CSVRecord(values, null, null, 0, -1); - header = new HashMap<>(); - header.put("first", Integer.valueOf(0)); - header.put("second", Integer.valueOf(1)); - header.put("third", Integer.valueOf(2)); - recordWithHeader = new CSVRecord(values, header, null, 0, -1); + final String rowData = StringUtils.join(values, ','); + try (CSVParser parser = CSVFormat.DEFAULT.parse(new StringReader(rowData))) { + record = parser.iterator().next(); + } + try (CSVParser parser = CSVFormat.DEFAULT.builder().setHeader(EnumHeader.class).get().parse(new StringReader(rowData))) { + recordWithHeader = parser.iterator().next(); + headerMap = parser.getHeaderMap(); + } + } + + @Test + void testCSVRecordNULLValues() throws IOException { + try (CSVParser parser = CSVParser.parse("A,B\r\nONE,TWO", CSVFormat.DEFAULT.withHeader())) { + final CSVRecord csvRecord = new CSVRecord(parser, null, null, 0L, 0L, 0L); + assertEquals(0, csvRecord.size()); + assertThrows(IllegalArgumentException.class, () -> csvRecord.get("B")); + } + } + + @Test + void testDuplicateHeaderGet() throws IOException { + final String csv = "A,A,B,B\n1,2,5,6\n"; + final CSVFormat format = CSVFormat.DEFAULT.builder().setHeader().get(); + + try (CSVParser parser = CSVParser.parse(csv, format)) { + final CSVRecord record = parser.nextRecord(); + + assertAll("Test that it gets the last instance of a column when there are duplicate headings", + () -> assertEquals("2", record.get("A")), + () -> 
assertEquals("6", record.get("B")) + ); + } } @Test - public void testGetInt() { + void testDuplicateHeaderToMap() throws IOException { + final String csv = "A,A,B,B\n1,2,5,6\n"; + final CSVFormat format = CSVFormat.DEFAULT.builder().setHeader().get(); + + try (CSVParser parser = CSVParser.parse(csv, format)) { + final CSVRecord record = parser.nextRecord(); + final Map map = record.toMap(); + + assertAll("Test that it gets the last instance of a column when there are duplicate headings", + () -> assertEquals("2", map.get("A")), + () -> assertEquals("6", map.get("B")) + ); + } + } + + @Test + void testGetInt() { assertEquals(values[0], record.get(0)); assertEquals(values[1], record.get(1)); assertEquals(values[2], record.get(2)); } @Test - public void testGetString() { - assertEquals(values[0], recordWithHeader.get("first")); - assertEquals(values[1], recordWithHeader.get("second")); - assertEquals(values[2], recordWithHeader.get("third")); + void testGetNullEnum() { + assertThrows(IllegalArgumentException.class, () -> recordWithHeader.get((Enum) null)); } - @Test(expected = IllegalArgumentException.class) - public void testGetStringInconsistentRecord() { - header.put("fourth", Integer.valueOf(4)); - recordWithHeader.get("fourth"); + @Test + void testGetString() { + assertEquals(values[0], recordWithHeader.get(EnumHeader.FIRST.name())); + assertEquals(values[1], recordWithHeader.get(EnumHeader.SECOND.name())); + assertEquals(values[2], recordWithHeader.get(EnumHeader.THIRD.name())); } - @Test(expected = IllegalStateException.class) - public void testGetStringNoHeader() { - record.get("first"); + @Test + void testGetStringInconsistentRecord() { + headerMap.put("fourth", Integer.valueOf(4)); + assertThrows(IllegalArgumentException.class, () -> recordWithHeader.get("fourth")); } - @Test(expected = IllegalArgumentException.class) - public void testGetUnmappedEnum() { - assertNull(recordWithHeader.get(EnumFixture.UNKNOWN_COLUMN)); + @Test + void testGetStringNoHeader() 
{ + assertThrows(IllegalStateException.class, () -> record.get("first")); } - @Test(expected = IllegalArgumentException.class) - public void testGetUnmappedName() { - assertNull(recordWithHeader.get("fourth")); + @Test + void testGetUnmappedEnum() { + assertThrows(IllegalArgumentException.class, () -> recordWithHeader.get(EnumFixture.UNKNOWN_COLUMN)); } - @Test(expected = ArrayIndexOutOfBoundsException.class) - public void testGetUnmappedNegativeInt() { - assertNull(recordWithHeader.get(Integer.MIN_VALUE)); + @Test + void testGetUnmappedName() { + assertThrows(IllegalArgumentException.class, () -> assertNull(recordWithHeader.get("fourth"))); } - @Test(expected = ArrayIndexOutOfBoundsException.class) - public void testGetUnmappedPositiveInt() { - assertNull(recordWithHeader.get(Integer.MAX_VALUE)); + @Test + void testGetUnmappedNegativeInt() { + assertThrows(ArrayIndexOutOfBoundsException.class, () -> recordWithHeader.get(Integer.MIN_VALUE)); + } + + @Test + void testGetUnmappedPositiveInt() { + assertThrows(ArrayIndexOutOfBoundsException.class, () -> recordWithHeader.get(Integer.MAX_VALUE)); } @Test - public void testIsConsistent() { + void testGetWithEnum() { + assertEquals(recordWithHeader.get("FIRST"), recordWithHeader.get(EnumHeader.FIRST)); + assertEquals(recordWithHeader.get("SECOND"), recordWithHeader.get(EnumHeader.SECOND)); + assertThrows(IllegalArgumentException.class, () -> recordWithHeader.get(EnumFixture.UNKNOWN_COLUMN)); + } + + @Test + void testIsConsistent() { assertTrue(record.isConsistent()); assertTrue(recordWithHeader.isConsistent()); + final Map map = recordWithHeader.getParser().getHeaderMap(); + map.put("fourth", Integer.valueOf(4)); + // We are working on a copy of the map, so the record should still be OK. 
+ assertTrue(recordWithHeader.isConsistent()); + } - header.put("fourth", Integer.valueOf(4)); - assertFalse(recordWithHeader.isConsistent()); + @Test + void testIsInconsistent() throws IOException { + final String[] headers = { "first", "second", "third" }; + final String rowData = StringUtils.join(values, ','); + try (CSVParser parser = CSVFormat.DEFAULT.withHeader(headers).parse(new StringReader(rowData))) { + final Map map = parser.getHeaderMapRaw(); + final CSVRecord record1 = parser.iterator().next(); + map.put("fourth", Integer.valueOf(4)); + assertFalse(record1.isConsistent()); + } } @Test - public void testIsMapped() { + void testIsMapped() { assertFalse(record.isMapped("first")); - assertTrue(recordWithHeader.isMapped("first")); + assertTrue(recordWithHeader.isMapped(EnumHeader.FIRST.name())); assertFalse(recordWithHeader.isMapped("fourth")); } @Test - public void testIsSet() { + void testIsSetInt() { + assertFalse(record.isSet(-1)); + assertTrue(record.isSet(0)); + assertTrue(record.isSet(2)); + assertFalse(record.isSet(3)); + assertTrue(recordWithHeader.isSet(1)); + assertFalse(recordWithHeader.isSet(1000)); + } + + @Test + void testIsSetString() { assertFalse(record.isSet("first")); - assertTrue(recordWithHeader.isSet("first")); - assertFalse(recordWithHeader.isSet("fourth")); + assertTrue(recordWithHeader.isSet(EnumHeader.FIRST.name())); + assertFalse(recordWithHeader.isSet("DOES NOT EXIST")); } @Test - public void testIterator() { + void testIterator() { int i = 0; for (final String value : record) { assertEquals(values[i], value); @@ -131,66 +238,152 @@ public void testIterator() { } @Test - public void testPutInMap() { + void testPutInMap() { final Map map = new ConcurrentHashMap<>(); this.recordWithHeader.putIn(map); - this.validateMap(map, false); - // Test that we can compile with assigment to the same map as the param. 
- final TreeMap map2 = recordWithHeader.putIn(new TreeMap()); - this.validateMap(map2, false); + validateMap(map, false); + // Test that we can compile with assignment to the same map as the param. + final TreeMap map2 = recordWithHeader.putIn(new TreeMap<>()); + validateMap(map2, false); } @Test - public void testRemoveAndAddColumns() throws IOException { + void testRemoveAndAddColumns() throws IOException { // do: - try (final CSVPrinter printer = new CSVPrinter(new StringBuilder(), CSVFormat.DEFAULT)) { + try (CSVPrinter printer = new CSVPrinter(new StringBuilder(), CSVFormat.DEFAULT)) { final Map map = recordWithHeader.toMap(); map.remove("OldColumn"); map.put("ZColumn", "NewValue"); // check: final ArrayList list = new ArrayList<>(map.values()); - Collections.sort(list); + list.sort(null); printer.printRecord(list); - Assert.assertEquals("A,B,C,NewValue" + CSVFormat.DEFAULT.getRecordSeparator(), printer.getOut().toString()); + assertEquals("A,B,C,NewValue" + CSVFormat.DEFAULT.getRecordSeparator(), printer.getOut().toString()); + } + } + + @Test + void testSerialization() throws IOException, ClassNotFoundException { + final CSVRecord shortRec; + try (CSVParser parser = CSVParser.parse("A,B\n#my comment\nOne,Two", CSVFormat.DEFAULT.withHeader().withCommentMarker('#'))) { + shortRec = parser.iterator().next(); } + final ByteArrayOutputStream out = new ByteArrayOutputStream(); + try (ObjectOutputStream oos = new ObjectOutputStream(out)) { + oos.writeObject(shortRec); + } + final ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray()); + try (ObjectInputStream ois = new ObjectInputStream(in)) { + final Object object = ois.readObject(); + assertInstanceOf(CSVRecord.class, object); + final CSVRecord rec = (CSVRecord) object; + assertEquals(1L, rec.getRecordNumber()); + assertEquals("One", rec.get(0)); + assertEquals("Two", rec.get(1)); + assertEquals(2, rec.size()); + assertEquals(shortRec.getCharacterPosition(), rec.getCharacterPosition()); + 
assertEquals("my comment", rec.getComment()); + // The parser is not serialized + assertNull(rec.getParser()); + // Check all header map functionality is absent + assertTrue(rec.isConsistent()); + assertFalse(rec.isMapped("A")); + assertFalse(rec.isSet("A")); + assertEquals(0, rec.toMap().size()); + // This will throw + assertThrows(IllegalStateException.class, () -> rec.get("A")); + } + } + + @Test + void testStream() { + final AtomicInteger i = new AtomicInteger(); + record.stream().forEach(value -> { + assertEquals(values[i.get()], value); + i.incrementAndGet(); + }); } @Test - public void testToMap() { + void testToListAdd() { + final String[] expected = values.clone(); + final List list = record.toList(); + list.add("Last"); + assertEquals("Last", list.get(list.size() - 1)); + assertEquals(list.size(), values.length + 1); + assertArrayEquals(expected, values); + } + + @Test + void testToListFor() { + int i = 0; + for (final String value : record.toList()) { + assertEquals(values[i], value); + i++; + } + } + + @Test + void testToListForEach() { + final AtomicInteger i = new AtomicInteger(); + record.toList().forEach(e -> { + assertEquals(values[i.getAndIncrement()], e); + }); + } + + @Test + void testToListSet() { + final String[] expected = values.clone(); + final List list = record.toList(); + list.set(list.size() - 1, "Last"); + assertEquals("Last", list.get(list.size() - 1)); + assertEquals(list.size(), values.length); + assertArrayEquals(expected, values); + } + + @Test + void testToMap() { final Map map = this.recordWithHeader.toMap(); - this.validateMap(map, true); + validateMap(map, true); } @Test - public void testToMapWithShortRecord() throws Exception { - try (final CSVParser parser = CSVParser.parse("a,b", CSVFormat.DEFAULT.withHeader("A", "B", "C"))) { + void testToMapWithNoHeader() throws Exception { + try (CSVParser parser = CSVParser.parse("a,b", CSVFormat.newFormat(','))) { final CSVRecord shortRec = parser.iterator().next(); - 
shortRec.toMap(); + final Map map = shortRec.toMap(); + assertNotNull(map, "Map is not null."); + assertTrue(map.isEmpty(), "Map is empty."); } } @Test - public void testToMapWithNoHeader() throws Exception { - try (final CSVParser parser = CSVParser.parse("a,b", CSVFormat.newFormat(','))) { + void testToMapWithShortRecord() throws Exception { + try (CSVParser parser = CSVParser.parse("a,b", CSVFormat.DEFAULT.withHeader("A", "B", "C"))) { final CSVRecord shortRec = parser.iterator().next(); - final Map map = shortRec.toMap(); - assertNotNull("Map is not null.", map); - assertTrue("Map is empty.", map.isEmpty()); + shortRec.toMap(); } } + @Test + void testToString() { + assertNotNull(recordWithHeader.toString()); + assertTrue(recordWithHeader.toString().contains("comment=")); + assertTrue(recordWithHeader.toString().contains("recordNumber=")); + assertTrue(recordWithHeader.toString().contains("values=")); + } + private void validateMap(final Map map, final boolean allowsNulls) { - assertTrue(map.containsKey("first")); - assertTrue(map.containsKey("second")); - assertTrue(map.containsKey("third")); + assertTrue(map.containsKey(EnumHeader.FIRST.name())); + assertTrue(map.containsKey(EnumHeader.SECOND.name())); + assertTrue(map.containsKey(EnumHeader.THIRD.name())); assertFalse(map.containsKey("fourth")); if (allowsNulls) { assertFalse(map.containsKey(null)); } - assertEquals("A", map.get("first")); - assertEquals("B", map.get("second")); - assertEquals("C", map.get("third")); - assertEquals(null, map.get("fourth")); + assertEquals("A", map.get(EnumHeader.FIRST.name())); + assertEquals("B", map.get(EnumHeader.SECOND.name())); + assertEquals("C", map.get(EnumHeader.THIRD.name())); + assertNull(map.get("fourth")); } - } diff --git a/src/test/java/org/apache/commons/csv/CsvAssertions.java b/src/test/java/org/apache/commons/csv/CsvAssertions.java new file mode 100644 index 0000000000..b6c2b5d9cd --- /dev/null +++ b/src/test/java/org/apache/commons/csv/CsvAssertions.java @@ 
-0,0 +1,29 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * https://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.commons.csv;
+
+import static org.junit.jupiter.api.Assertions.assertArrayEquals;
+
+public class CsvAssertions {
+
+    public static void assertValuesEquals(final String[] expected, final CSVRecord actual) {
+        assertArrayEquals(expected, actual.values());
+    }
+}
diff --git a/src/test/java/org/apache/commons/csv/ExtendedBufferedReaderTest.java b/src/test/java/org/apache/commons/csv/ExtendedBufferedReaderTest.java
index 86f33353f9..b8d9b9f198 100644
--- a/src/test/java/org/apache/commons/csv/ExtendedBufferedReaderTest.java
+++ b/src/test/java/org/apache/commons/csv/ExtendedBufferedReaderTest.java
@@ -1,115 +1,231 @@
 /*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements. See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.
You may obtain a copy of the License at + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 + * https://www.apache.org/licenses/LICENSE-2.0 * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. */ package org.apache.commons.csv; -import static org.apache.commons.csv.Constants.END_OF_STREAM; import static org.apache.commons.csv.Constants.UNDEFINED; -import static org.junit.Assert.assertArrayEquals; -import static org.junit.Assert.assertEquals; -import static org.junit.Assert.assertNull; +import static org.apache.commons.io.IOUtils.EOF; +import static org.junit.jupiter.api.Assertions.assertArrayEquals; +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertNull; import java.io.StringReader; +import java.nio.charset.StandardCharsets; -import org.junit.Test; +import org.junit.jupiter.api.Test; /** - * - * - * @version $Id$ + * Test {@link ExtendedBufferedReader}. 
*/ -public class ExtendedBufferedReaderTest { +class ExtendedBufferedReaderTest { + + static final String LF = "\n"; + static final String CR = "\r"; + static final String CRLF = CR + LF; + static final String LFCR = LF + CR; // easier to read the string below + + private ExtendedBufferedReader createBufferedReader(final String s) { + return new ExtendedBufferedReader(new StringReader(s)); + } @Test - public void testEmptyInput() throws Exception { - try (final ExtendedBufferedReader br = createBufferedReader("")) { - assertEquals(END_OF_STREAM, br.read()); - assertEquals(END_OF_STREAM, br.lookAhead()); - assertEquals(END_OF_STREAM, br.getLastChar()); + void testEmptyInput() throws Exception { + try (ExtendedBufferedReader br = createBufferedReader("")) { + assertEquals(EOF, br.read()); + assertEquals(EOF, br.peek()); + assertEquals(EOF, br.getLastChar()); assertNull(br.readLine()); assertEquals(0, br.read(new char[10], 0, 0)); } } + /* + * Test to illustrate https://issues.apache.org/jira/browse/CSV-75 + */ + @Test + void testReadChar() throws Exception { + final String test = "a" + LF + "b" + CR + "c" + LF + LF + "d" + CR + CR + "e" + LFCR + "f " + CRLF; + // EOL eol EOL EOL eol eol EOL+CR EOL + final int eolCount = 9; + + try (ExtendedBufferedReader br = createBufferedReader(test)) { + assertEquals(0, br.getLineNumber()); + int lineCount = 0; + while (br.readLine() != null) { + // consume all + lineCount++; + } + assertEquals(eolCount, br.getLineNumber()); + assertEquals(lineCount, br.getLineNumber()); + } + try (ExtendedBufferedReader br = createBufferedReader(test)) { + assertEquals(0, br.getLineNumber()); + int readCount = 0; + while (br.read() != EOF) { + // consume all + readCount++; + } + assertEquals(eolCount, br.getLineNumber()); + assertEquals(readCount, test.length()); + } + try (ExtendedBufferedReader br = createBufferedReader(test)) { + assertEquals(0, br.getLineNumber()); + final char[] buff = new char[10]; + while (br.read(buff, 0, 3) != EOF) { + 
// consume all + } + assertEquals(eolCount, br.getLineNumber()); + } + } + @Test - public void testReadLookahead1() throws Exception { - try (final ExtendedBufferedReader br = createBufferedReader("1\n2\r3\n")) { - assertEquals(0, br.getCurrentLineNumber()); - assertEquals('1', br.lookAhead()); + void testReadingInDifferentBuffer() throws Exception { + final char[] tmp1 = new char[2]; + final char[] tmp2 = new char[4]; + try (ExtendedBufferedReader reader = createBufferedReader("1\r\n2\r\n")) { + reader.read(tmp1, 0, 2); + reader.read(tmp2, 2, 2); + assertEquals(2, reader.getLineNumber()); + } + } + + @Test + void testReadingSupplementaryCharacterTracksBytes() throws Exception { + final String input = "😀"; + final char[] buffer = new char[input.length()]; + try (ExtendedBufferedReader reader = new ExtendedBufferedReader(new StringReader(input), StandardCharsets.UTF_8, true)) { + assertEquals(input.length(), reader.read(buffer, 0, buffer.length)); + assertArrayEquals(input.toCharArray(), buffer); + assertEquals(input.getBytes(StandardCharsets.UTF_8).length, reader.getBytesRead()); + assertEquals(input.length(), reader.getPosition()); + assertEquals(input.charAt(input.length() - 1), reader.getLastChar()); + } + } + + @Test + void testReadLine() throws Exception { + try (ExtendedBufferedReader br = createBufferedReader("")) { + assertNull(br.readLine()); + } + try (ExtendedBufferedReader br = createBufferedReader("\n")) { + assertEquals("", br.readLine()); + assertNull(br.readLine()); + } + try (ExtendedBufferedReader br = createBufferedReader("foo\n\nhello")) { + assertEquals(0, br.getLineNumber()); + assertEquals("foo", br.readLine()); + assertEquals(1, br.getLineNumber()); + assertEquals("", br.readLine()); + assertEquals(2, br.getLineNumber()); + assertEquals("hello", br.readLine()); + assertEquals(3, br.getLineNumber()); + assertNull(br.readLine()); + assertEquals(3, br.getLineNumber()); + } + try (ExtendedBufferedReader br = createBufferedReader("foo\n\nhello")) 
{ + assertEquals('f', br.read()); + assertEquals('o', br.peek()); + assertEquals("oo", br.readLine()); + assertEquals(1, br.getLineNumber()); + assertEquals('\n', br.peek()); + assertEquals("", br.readLine()); + assertEquals(2, br.getLineNumber()); + assertEquals('h', br.peek()); + assertEquals("hello", br.readLine()); + assertNull(br.readLine()); + assertEquals(3, br.getLineNumber()); + } + try (ExtendedBufferedReader br = createBufferedReader("foo\rbaar\r\nfoo")) { + assertEquals("foo", br.readLine()); + assertEquals('b', br.peek()); + assertEquals("baar", br.readLine()); + assertEquals('f', br.peek()); + assertEquals("foo", br.readLine()); + assertNull(br.readLine()); + } + } + + @Test + void testReadLookahead1() throws Exception { + try (ExtendedBufferedReader br = createBufferedReader("1\n2\r3\n")) { + assertEquals(0, br.getLineNumber()); + assertEquals('1', br.peek()); assertEquals(UNDEFINED, br.getLastChar()); - assertEquals(0, br.getCurrentLineNumber()); + assertEquals(0, br.getLineNumber()); assertEquals('1', br.read()); // Start line 1 assertEquals('1', br.getLastChar()); - assertEquals(1, br.getCurrentLineNumber()); - assertEquals('\n', br.lookAhead()); - assertEquals(1, br.getCurrentLineNumber()); + assertEquals(1, br.getLineNumber()); + assertEquals('\n', br.peek()); + assertEquals(1, br.getLineNumber()); assertEquals('1', br.getLastChar()); assertEquals('\n', br.read()); - assertEquals(1, br.getCurrentLineNumber()); + assertEquals(1, br.getLineNumber()); assertEquals('\n', br.getLastChar()); - assertEquals(1, br.getCurrentLineNumber()); + assertEquals(1, br.getLineNumber()); - assertEquals('2', br.lookAhead()); - assertEquals(1, br.getCurrentLineNumber()); + assertEquals('2', br.peek()); + assertEquals(1, br.getLineNumber()); assertEquals('\n', br.getLastChar()); - assertEquals(1, br.getCurrentLineNumber()); + assertEquals(1, br.getLineNumber()); assertEquals('2', br.read()); // Start line 2 - assertEquals(2, br.getCurrentLineNumber()); + 
assertEquals(2, br.getLineNumber()); assertEquals('2', br.getLastChar()); - assertEquals('\r', br.lookAhead()); - assertEquals(2, br.getCurrentLineNumber()); + assertEquals('\r', br.peek()); + assertEquals(2, br.getLineNumber()); assertEquals('2', br.getLastChar()); assertEquals('\r', br.read()); assertEquals('\r', br.getLastChar()); - assertEquals(2, br.getCurrentLineNumber()); + assertEquals(2, br.getLineNumber()); - assertEquals('3', br.lookAhead()); + assertEquals('3', br.peek()); assertEquals('\r', br.getLastChar()); assertEquals('3', br.read()); // Start line 3 assertEquals('3', br.getLastChar()); - assertEquals(3, br.getCurrentLineNumber()); + assertEquals(3, br.getLineNumber()); - assertEquals('\n', br.lookAhead()); - assertEquals(3, br.getCurrentLineNumber()); + assertEquals('\n', br.peek()); + assertEquals(3, br.getLineNumber()); assertEquals('3', br.getLastChar()); assertEquals('\n', br.read()); - assertEquals(3, br.getCurrentLineNumber()); + assertEquals(3, br.getLineNumber()); assertEquals('\n', br.getLastChar()); - assertEquals(3, br.getCurrentLineNumber()); + assertEquals(3, br.getLineNumber()); - assertEquals(END_OF_STREAM, br.lookAhead()); + assertEquals(EOF, br.peek()); assertEquals('\n', br.getLastChar()); - assertEquals(END_OF_STREAM, br.read()); - assertEquals(END_OF_STREAM, br.getLastChar()); - assertEquals(END_OF_STREAM, br.read()); - assertEquals(END_OF_STREAM, br.lookAhead()); - assertEquals(3, br.getCurrentLineNumber()); + assertEquals(EOF, br.read()); + assertEquals(EOF, br.getLastChar()); + assertEquals(EOF, br.read()); + assertEquals(EOF, br.peek()); + assertEquals(3, br.getLineNumber()); } } @Test - public void testReadLookahead2() throws Exception { + void testReadLookahead2() throws Exception { final char[] ref = new char[5]; final char[] res = new char[5]; - try (final ExtendedBufferedReader br = createBufferedReader("abcdefg")) { + try (ExtendedBufferedReader br = createBufferedReader("abcdefg")) { ref[0] = 'a'; ref[1] = 'b'; 
ref[2] = 'c'; @@ -117,96 +233,11 @@ public void testReadLookahead2() throws Exception { assertArrayEquals(ref, res); assertEquals('c', br.getLastChar()); - assertEquals('d', br.lookAhead()); + assertEquals('d', br.peek()); ref[4] = 'd'; assertEquals(1, br.read(res, 4, 1)); assertArrayEquals(ref, res); assertEquals('d', br.getLastChar()); } } - - @Test - public void testReadLine() throws Exception { - try (final ExtendedBufferedReader br = createBufferedReader("")) { - assertNull(br.readLine()); - } - try (final ExtendedBufferedReader br = createBufferedReader("\n")) { - assertEquals("", br.readLine()); - assertNull(br.readLine()); - } - try (final ExtendedBufferedReader br = createBufferedReader("foo\n\nhello")) { - assertEquals(0, br.getCurrentLineNumber()); - assertEquals("foo", br.readLine()); - assertEquals(1, br.getCurrentLineNumber()); - assertEquals("", br.readLine()); - assertEquals(2, br.getCurrentLineNumber()); - assertEquals("hello", br.readLine()); - assertEquals(3, br.getCurrentLineNumber()); - assertNull(br.readLine()); - assertEquals(3, br.getCurrentLineNumber()); - } - try (final ExtendedBufferedReader br = createBufferedReader("foo\n\nhello")) { - assertEquals('f', br.read()); - assertEquals('o', br.lookAhead()); - assertEquals("oo", br.readLine()); - assertEquals(1, br.getCurrentLineNumber()); - assertEquals('\n', br.lookAhead()); - assertEquals("", br.readLine()); - assertEquals(2, br.getCurrentLineNumber()); - assertEquals('h', br.lookAhead()); - assertEquals("hello", br.readLine()); - assertNull(br.readLine()); - assertEquals(3, br.getCurrentLineNumber()); - } - try (final ExtendedBufferedReader br = createBufferedReader("foo\rbaar\r\nfoo")) { - assertEquals("foo", br.readLine()); - assertEquals('b', br.lookAhead()); - assertEquals("baar", br.readLine()); - assertEquals('f', br.lookAhead()); - assertEquals("foo", br.readLine()); - assertNull(br.readLine()); - } - } - - /* - * Test to illustrate https://issues.apache.org/jira/browse/CSV-75 - * - 
*/ - @Test - public void testReadChar() throws Exception { - final String LF = "\n"; - final String CR = "\r"; - final String CRLF = CR + LF; - final String LFCR = LF + CR;// easier to read the string below - final String test = "a" + LF + "b" + CR + "c" + LF + LF + "d" + CR + CR + "e" + LFCR + "f " + CRLF; - // EOL eol EOL EOL eol eol EOL+CR EOL - final int EOLeolct = 9; - - try (final ExtendedBufferedReader br = createBufferedReader(test)) { - assertEquals(0, br.getCurrentLineNumber()); - while (br.readLine() != null) { - // consume all - } - assertEquals(EOLeolct, br.getCurrentLineNumber()); - } - try (final ExtendedBufferedReader br = createBufferedReader(test)) { - assertEquals(0, br.getCurrentLineNumber()); - while (br.read() != -1) { - // consume all - } - assertEquals(EOLeolct, br.getCurrentLineNumber()); - } - try (final ExtendedBufferedReader br = createBufferedReader(test)) { - assertEquals(0, br.getCurrentLineNumber()); - final char[] buff = new char[10]; - while (br.read(buff, 0, 3) != -1) { - // consume all - } - assertEquals(EOLeolct, br.getCurrentLineNumber()); - } - } - - private ExtendedBufferedReader createBufferedReader(final String s) { - return new ExtendedBufferedReader(new StringReader(s)); - } } diff --git a/src/test/java/org/apache/commons/csv/JiraCsv196Test.java b/src/test/java/org/apache/commons/csv/JiraCsv196Test.java new file mode 100644 index 0000000000..aaf8e206b3 --- /dev/null +++ b/src/test/java/org/apache/commons/csv/JiraCsv196Test.java @@ -0,0 +1,76 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at
+ *
+ * https://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.commons.csv;
+
+import static org.junit.jupiter.api.Assertions.assertEquals;
+
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.io.Reader;
+import java.nio.charset.StandardCharsets;
+
+import org.junit.jupiter.api.Test;
+
+class JiraCsv196Test {
+
+    private Reader getTestInput(final String path) {
+        return new InputStreamReader(ClassLoader.getSystemClassLoader().getResourceAsStream(path));
+    }
+
+    @Test
+    void testParseFourBytes() throws IOException {
+        final CSVFormat format = CSVFormat.Builder.create().setDelimiter(',').setQuote('\'').get();
+        // @formatter:off
+        try (@SuppressWarnings("resource") // parser closes the reader.
+            CSVParser parser = new CSVParser.Builder()
+                .setFormat(format)
+                .setReader(getTestInput("org/apache/commons/csv/CSV-196/emoji.csv"))
+                .setCharset(StandardCharsets.UTF_8)
+                .setTrackBytes(true)
+                .get()) {
+            // @formatter:on
+            final long[] charByteKey = { 0, 84, 701, 1318, 1935 };
+            int idx = 0;
+            for (final CSVRecord record : parser) {
+                assertEquals(charByteKey[idx++], record.getBytePosition(), "At index " + idx);
+            }
+        }
+    }
+
+    @Test
+    void testParseThreeBytes() throws IOException {
+        final CSVFormat format = CSVFormat.Builder.create().setDelimiter(',').setQuote('\'').get();
+        // @formatter:off
+        try (@SuppressWarnings("resource") // parser closes the reader.
+            CSVParser parser = new CSVParser.Builder()
+                .setFormat(format)
+                .setReader(getTestInput("org/apache/commons/csv/CSV-196/japanese.csv"))
+                .setCharset(StandardCharsets.UTF_8)
+                .setTrackBytes(true)
+                .get()) {
+            // @formatter:on
+            final long[] charByteKey = { 0, 89, 242, 395 };
+            int idx = 0;
+            for (final CSVRecord record : parser) {
+                assertEquals(charByteKey[idx++], record.getBytePosition(), "At index " + idx);
+            }
+        }
+    }
+}
diff --git a/src/test/java/org/apache/commons/csv/JiraCsv318Test.java b/src/test/java/org/apache/commons/csv/JiraCsv318Test.java
new file mode 100644
index 0000000000..984509e87d
--- /dev/null
+++ b/src/test/java/org/apache/commons/csv/JiraCsv318Test.java
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * https://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.commons.csv;
+
+import static org.junit.jupiter.api.Assertions.assertEquals;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.PrintWriter;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.stream.Stream;
+
+import org.apache.commons.io.function.IOConsumer;
+import org.apache.commons.io.function.IOStream;
+import org.apache.commons.lang3.ArrayUtils;
+import org.junit.jupiter.api.Test;
+
+/**
+ * Tests https://issues.apache.org/jira/projects/CSV/issues/CSV-318?filter=allopenissues
+ *
+ * @see CSVPrinter
+ */
+class JiraCsv318Test {
+
+    private void checkOutput(final ByteArrayOutputStream baos) {
+        checkOutput(baos.toString());
+    }
+
+    private void checkOutput(final String string) {
+        assertEquals("col a,col b,col c", string.trim());
+    }
+
+    private Stream newParallelStream() {
+        // returned stream is intermediate
+        return newStream().parallel();
+    }
+
+    private CSVPrinter newPrinter(final ByteArrayOutputStream baos) throws IOException {
+        return new CSVPrinter(new PrintWriter(baos), CSVFormat.DEFAULT);
+    }
+
+    private Stream newSequentialStream() {
+        // returned stream is intermediate
+        return newStream().sequential();
+    }
+
+    private Stream newStream() {
+        return Stream.of("col a", "col b", "col c");
+    }
+
+    @Test
+    void testDefaultStream() throws IOException {
+        final ByteArrayOutputStream baos = new ByteArrayOutputStream();
+        try (CSVPrinter printer = newPrinter(baos)) {
+            printer.printRecord(newStream());
+        }
+        checkOutput(baos);
+    }
+
+    @SuppressWarnings("resource")
+    @Test
+    void testParallelIOStream() throws IOException {
+        final ByteArrayOutputStream baos = new ByteArrayOutputStream();
+        try (CSVPrinter printer = newPrinter(baos)) {
+            IOStream.adapt(newParallelStream()).forEachOrdered(printer::print);
+        }
+        // No EOR marker in this test intentionally, so checkOutput will trim.
+        checkOutput(baos);
+    }
+
+    @SuppressWarnings("resource")
+    @Test
+    void testParallelIOStreamSynchronizedPrinterNotUsed() throws IOException {
+        final ByteArrayOutputStream baos = new ByteArrayOutputStream();
+        try (CSVPrinter printer = newPrinter(baos)) {
+            synchronized (printer) {
+                IOStream.adapt(newParallelStream()).forEachOrdered(IOConsumer.noop());
+            }
+        }
+        final List list = new ArrayList<>();
+        try (CSVPrinter printer = newPrinter(baos)) {
+            synchronized (printer) {
+                IOStream.adapt(newParallelStream()).forEachOrdered(list::add);
+            }
+        }
+        // No EOR marker in this test intentionally, so checkOutput will trim.
+        checkOutput(String.join(",", list.toArray(ArrayUtils.EMPTY_STRING_ARRAY)));
+    }
+
+    @Test
+    void testParallelStream() throws IOException {
+        final ByteArrayOutputStream baos = new ByteArrayOutputStream();
+        try (CSVPrinter printer = newPrinter(baos)) {
+            printer.printRecord(newParallelStream());
+        }
+        checkOutput(baos);
+    }
+
+    @Test
+    void testSequentialStream() throws IOException {
+        final ByteArrayOutputStream baos = new ByteArrayOutputStream();
+        try (CSVPrinter printer = newPrinter(baos)) {
+            printer.printRecord(newSequentialStream());
+        }
+        checkOutput(baos);
+    }
+}
diff --git a/src/test/java/org/apache/commons/csv/LexerTest.java b/src/test/java/org/apache/commons/csv/LexerTest.java
index 26fa843a3e..a76f6e513b 100644
--- a/src/test/java/org/apache/commons/csv/LexerTest.java
+++ b/src/test/java/org/apache/commons/csv/LexerTest.java
@@ -1,18 +1,20 @@
 /*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements. See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.
You may obtain a copy of the License at
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
  *
- * http://www.apache.org/licenses/LICENSE-2.0
+ * https://www.apache.org/licenses/LICENSE-2.0
  *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
*/ package org.apache.commons.csv; @@ -26,108 +28,128 @@ import static org.apache.commons.csv.Token.Type.EOF; import static org.apache.commons.csv.Token.Type.EORECORD; import static org.apache.commons.csv.Token.Type.TOKEN; -import static org.apache.commons.csv.TokenMatchers.hasContent; -import static org.apache.commons.csv.TokenMatchers.matches; -import static org.junit.Assert.assertFalse; -import static org.junit.Assert.assertThat; -import static org.junit.Assert.assertTrue; +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertThrows; +import static org.junit.jupiter.api.Assertions.assertTrue; import java.io.IOException; import java.io.StringReader; -import org.junit.Before; -import org.junit.Test; +import org.junit.jupiter.api.BeforeEach; +import org.junit.jupiter.api.Test; /** - * - * - * @version $Id$ */ -public class LexerTest { +class LexerTest { - private CSVFormat formatWithEscaping; + private static void assertContent(final String expectedContent, final Token actualToken) { + assertEquals(expectedContent, actualToken.content.toString()); + } - @Before - public void setUp() { - formatWithEscaping = CSVFormat.DEFAULT.withEscape('\\'); + private static void assertNextToken(final String expectedContent, final Lexer lexer) throws IOException { + assertContent(expectedContent, lexer.nextToken(new Token())); + } + + private static void assertNextToken(final Token.Type expectedType, final String expectedContent, final Lexer lexer) throws IOException { + final Token actualToken = lexer.nextToken(new Token()); + assertEquals(expectedType, actualToken.type); + assertContent(expectedContent, actualToken); } + private CSVFormat formatWithEscaping; + + @SuppressWarnings("resource") private Lexer createLexer(final String input, final CSVFormat format) { return new Lexer(format, new ExtendedBufferedReader(new StringReader(input))); } + @BeforeEach + 
public void setUp() { + formatWithEscaping = CSVFormat.DEFAULT.withEscape('\\'); + } + + // simple token with escaping enabled @Test - public void testSurroundingSpacesAreDeleted() throws IOException { - final String code = "noSpaces, leadingSpaces,trailingSpaces , surroundingSpaces , ,,"; - try (final Lexer parser = createLexer(code, CSVFormat.DEFAULT.withIgnoreSurroundingSpaces())) { - assertThat(parser.nextToken(new Token()), matches(TOKEN, "noSpaces")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "leadingSpaces")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "trailingSpaces")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "surroundingSpaces")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "")); - assertThat(parser.nextToken(new Token()), matches(EOF, "")); + void testBackslashWithEscaping() throws IOException { + /* + * file: a,\,,b \,, + */ + final String code = "a,\\,,b\\\\\n\\,,\\\nc,d\\\r\ne"; + final CSVFormat format = formatWithEscaping.withIgnoreEmptyLines(false); + assertTrue(format.isEscapeCharacterSet()); + try (Lexer lexer = createLexer(code, format)) { + assertNextToken(TOKEN, "a", lexer); + assertNextToken(TOKEN, ",", lexer); + assertNextToken(EORECORD, "b\\", lexer); + assertNextToken(TOKEN, ",", lexer); + assertNextToken(TOKEN, "\nc", lexer); + assertNextToken(EORECORD, "d\r", lexer); + assertNextToken(EOF, "e", lexer); } } + // simple token with escaping not enabled @Test - public void testSurroundingTabsAreDeleted() throws IOException { - final String code = "noTabs,\tleadingTab,trailingTab\t,\tsurroundingTabs\t,\t\t,,"; - try (final Lexer parser = createLexer(code, CSVFormat.DEFAULT.withIgnoreSurroundingSpaces())) { - assertThat(parser.nextToken(new Token()), matches(TOKEN, "noTabs")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "leadingTab")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, 
"trailingTab")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "surroundingTabs")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "")); - assertThat(parser.nextToken(new Token()), matches(EOF, "")); + void testBackslashWithoutEscaping() throws IOException { + /* + * file: a,\,,b \,, + */ + final String code = "a,\\,,b\\\n\\,,"; + final CSVFormat format = CSVFormat.DEFAULT; + assertFalse(format.isEscapeCharacterSet()); + try (Lexer lexer = createLexer(code, format)) { + // parser.nextToken(new Token()) + assertNextToken(TOKEN, "a", lexer); + // an unquoted single backslash is not an escape char + assertNextToken(TOKEN, "\\", lexer); + assertNextToken(TOKEN, "", lexer); + assertNextToken(EORECORD, "b\\", lexer); + // an unquoted single backslash is not an escape char + assertNextToken(TOKEN, "\\", lexer); + assertNextToken(TOKEN, "", lexer); + assertNextToken(EOF, "", lexer); } } @Test - public void testIgnoreEmptyLines() throws IOException { - final String code = "first,line,\n" + "\n" + "\n" + "second,line\n" + "\n" + "\n" + "third line \n" + "\n" + - "\n" + "last, line \n" + "\n" + "\n" + "\n"; - final CSVFormat format = CSVFormat.DEFAULT.withIgnoreEmptyLines(); - try (final Lexer parser = createLexer(code, format)) { - assertThat(parser.nextToken(new Token()), matches(TOKEN, "first")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "line")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, "")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "second")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, "line")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, "third line ")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "last")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, " line ")); - assertThat(parser.nextToken(new Token()), matches(EOF, "")); - assertThat(parser.nextToken(new 
Token()), matches(EOF, "")); + void testBackspace() throws Exception { + try (Lexer lexer = createLexer("character" + BACKSPACE + "NotEscaped", formatWithEscaping)) { + assertNextToken("character" + BACKSPACE + "NotEscaped", lexer); } } @Test - public void testComments() throws IOException { - final String code = "first,line,\n" + "second,line,tokenWith#no-comment\n" + "# comment line \n" + - "third,line,#no-comment\n" + "# penultimate comment\n" + "# Final comment\n"; + void testComments() throws IOException { + // @formatter:off + final String code = "first,line,\n" + + "second,line,tokenWith#no-comment\n" + + "# comment line \n" + + "third,line,#no-comment\n" + + "# penultimate comment\n" + + "# Final comment\n"; + // @formatter:on final CSVFormat format = CSVFormat.DEFAULT.withCommentMarker('#'); - try (final Lexer parser = createLexer(code, format)) { - assertThat(parser.nextToken(new Token()), matches(TOKEN, "first")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "line")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, "")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "second")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "line")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, "tokenWith#no-comment")); - assertThat(parser.nextToken(new Token()), matches(COMMENT, "comment line")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "third")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "line")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, "#no-comment")); - assertThat(parser.nextToken(new Token()), matches(COMMENT, "penultimate comment")); - assertThat(parser.nextToken(new Token()), matches(COMMENT, "Final comment")); - assertThat(parser.nextToken(new Token()), matches(EOF, "")); - assertThat(parser.nextToken(new Token()), matches(EOF, "")); - } - } - - @Test - public void testCommentsAndEmptyLines() throws IOException { + try (Lexer lexer = 
createLexer(code, format)) { + assertNextToken(TOKEN, "first", lexer); + assertNextToken(TOKEN, "line", lexer); + assertNextToken(EORECORD, "", lexer); + assertNextToken(TOKEN, "second", lexer); + assertNextToken(TOKEN, "line", lexer); + assertNextToken(EORECORD, "tokenWith#no-comment", lexer); + assertNextToken(COMMENT, "comment line", lexer); + assertNextToken(TOKEN, "third", lexer); + assertNextToken(TOKEN, "line", lexer); + assertNextToken(EORECORD, "#no-comment", lexer); + assertNextToken(COMMENT, "penultimate comment", lexer); + assertNextToken(COMMENT, "Final comment", lexer); + assertNextToken(EOF, "", lexer); + assertNextToken(EOF, "", lexer); + } + } + + @Test + void testCommentsAndEmptyLines() throws IOException { final String code = "1,2,3,\n" + // 1 "\n" + // 1b "\n" + // 1c @@ -143,250 +165,386 @@ public void testCommentsAndEmptyLines() throws IOException { "\n" + // 6c "# Final comment\n"; // 7 final CSVFormat format = CSVFormat.DEFAULT.withCommentMarker('#').withIgnoreEmptyLines(false); - assertFalse("Should not ignore empty lines", format.getIgnoreEmptyLines()); - - try (final Lexer parser = createLexer(code, format)) { - assertThat(parser.nextToken(new Token()), matches(TOKEN, "1")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "2")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "3")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, "")); // 1 - assertThat(parser.nextToken(new Token()), matches(EORECORD, "")); // 1b - assertThat(parser.nextToken(new Token()), matches(EORECORD, "")); // 1c - assertThat(parser.nextToken(new Token()), matches(TOKEN, "a")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "b x")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, "c#no-comment")); // 2 - assertThat(parser.nextToken(new Token()), matches(COMMENT, "foo")); // 3 - assertThat(parser.nextToken(new Token()), matches(EORECORD, "")); // 4 - assertThat(parser.nextToken(new Token()), 
matches(EORECORD, "")); // 4b - assertThat(parser.nextToken(new Token()), matches(TOKEN, "d")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "e")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, "#no-comment")); // 5 - assertThat(parser.nextToken(new Token()), matches(EORECORD, "")); // 5b - assertThat(parser.nextToken(new Token()), matches(EORECORD, "")); // 5c - assertThat(parser.nextToken(new Token()), matches(COMMENT, "penultimate comment")); // 6 - assertThat(parser.nextToken(new Token()), matches(EORECORD, "")); // 6b - assertThat(parser.nextToken(new Token()), matches(EORECORD, "")); // 6c - assertThat(parser.nextToken(new Token()), matches(COMMENT, "Final comment")); // 7 - assertThat(parser.nextToken(new Token()), matches(EOF, "")); - assertThat(parser.nextToken(new Token()), matches(EOF, "")); + assertFalse(format.getIgnoreEmptyLines(), "Should not ignore empty lines"); + + try (Lexer lexer = createLexer(code, format)) { + assertNextToken(TOKEN, "1", lexer); + assertNextToken(TOKEN, "2", lexer); + assertNextToken(TOKEN, "3", lexer); + assertNextToken(EORECORD, "", lexer); // 1 + assertNextToken(EORECORD, "", lexer); // 1b + assertNextToken(EORECORD, "", lexer); // 1c + assertNextToken(TOKEN, "a", lexer); + assertNextToken(TOKEN, "b x", lexer); + assertNextToken(EORECORD, "c#no-comment", lexer); // 2 + assertNextToken(COMMENT, "foo", lexer); // 3 + assertNextToken(EORECORD, "", lexer); // 4 + assertNextToken(EORECORD, "", lexer); // 4b + assertNextToken(TOKEN, "d", lexer); + assertNextToken(TOKEN, "e", lexer); + assertNextToken(EORECORD, "#no-comment", lexer); // 5 + assertNextToken(EORECORD, "", lexer); // 5b + assertNextToken(EORECORD, "", lexer); // 5c + assertNextToken(COMMENT, "penultimate comment", lexer); // 6 + assertNextToken(EORECORD, "", lexer); // 6b + assertNextToken(EORECORD, "", lexer); // 6c + assertNextToken(COMMENT, "Final comment", lexer); // 7 + assertNextToken(EOF, "", lexer); + assertNextToken(EOF, "", 
lexer); } } - // simple token with escaping not enabled @Test - public void testBackslashWithoutEscaping() throws IOException { - /* - * file: a,\,,b \,, - */ - final String code = "a,\\,,b\\\n\\,,"; - final CSVFormat format = CSVFormat.DEFAULT; - assertFalse(format.isEscapeCharacterSet()); - try (final Lexer parser = createLexer(code, format)) { - assertThat(parser.nextToken(new Token()), matches(TOKEN, "a")); - // an unquoted single backslash is not an escape char - assertThat(parser.nextToken(new Token()), matches(TOKEN, "\\")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, "b\\")); - // an unquoted single backslash is not an escape char - assertThat(parser.nextToken(new Token()), matches(TOKEN, "\\")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "")); - assertThat(parser.nextToken(new Token()), matches(EOF, "")); + void testCR() throws Exception { + try (Lexer lexer = createLexer("character" + CR + "NotEscaped", formatWithEscaping)) { + assertNextToken("character", lexer); + assertNextToken("NotEscaped", lexer); } } - // simple token with escaping enabled + // From CSV-1 @Test - public void testBackslashWithEscaping() throws IOException { - /* - * file: a,\,,b \,, - */ - final String code = "a,\\,,b\\\\\n\\,,\\\nc,d\\\r\ne"; - final CSVFormat format = formatWithEscaping.withIgnoreEmptyLines(false); - assertTrue(format.isEscapeCharacterSet()); - try (final Lexer parser = createLexer(code, format)) { - assertThat(parser.nextToken(new Token()), matches(TOKEN, "a")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, ",")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, "b\\")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, ",")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "\nc")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, "d\r")); - assertThat(parser.nextToken(new Token()), matches(EOF, "e")); + 
void testDelimiterIsWhitespace() throws IOException { + final String code = "one\ttwo\t\tfour \t five\t six"; + try (Lexer lexer = createLexer(code, CSVFormat.TDF)) { + assertNextToken(TOKEN, "one", lexer); + assertNextToken(TOKEN, "two", lexer); + assertNextToken(TOKEN, "", lexer); + assertNextToken(TOKEN, "four", lexer); + assertNextToken(TOKEN, "five", lexer); + assertNextToken(EOF, "six", lexer); } } - // encapsulator tokenizer (single line) + /** + * With {@code ignoreSurroundingSpaces} enabled and a multi-character delimiter whose first character is whitespace, + * the side-effecting {@link Lexer#isDelimiter(int)} must only be evaluated once per character, otherwise the + * delimiter is consumed in the whitespace-skip loop and the empty field at the boundary is dropped. + */ @Test - public void testNextToken4() throws IOException { - /* - * file: a,"foo",b a, " foo",b a,"foo " ,b // whitespace after closing encapsulator a, " foo " ,b - */ - final String code = "a,\"foo\",b\na, \" foo\",b\na,\"foo \" ,b\na, \" foo \" ,b"; - try (final Lexer parser = createLexer(code, CSVFormat.DEFAULT.withIgnoreSurroundingSpaces())) { - assertThat(parser.nextToken(new Token()), matches(TOKEN, "a")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "foo")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, "b")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "a")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, " foo")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, "b")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "a")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "foo ")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, "b")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "a")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, " foo ")); - // assertTokenEquals(EORECORD, "b", parser.nextToken(new Token())); - assertThat(parser.nextToken(new 
Token()), matches(EOF, "b")); + void testEmptyTokenBeforeWhitespacePrefixedMultiCharacterDelimiter() throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.builder().setDelimiter(" |").setIgnoreSurroundingSpaces(true).get(); + try (Lexer lexer = createLexer(" |a", format)) { + assertNextToken(TOKEN, "", lexer); + assertNextToken(EOF, "a", lexer); + } + try (Lexer lexer = createLexer("a | |b", format)) { + assertNextToken(TOKEN, "a", lexer); + assertNextToken(TOKEN, "", lexer); + assertNextToken(EOF, "b", lexer); } } - // encapsulator tokenizer (multi line, delimiter in string) @Test - public void testNextToken5() throws IOException { - final String code = "a,\"foo\n\",b\n\"foo\n baar ,,,\"\n\"\n\t \n\""; - try (final Lexer parser = createLexer(code, CSVFormat.DEFAULT)) { - assertThat(parser.nextToken(new Token()), matches(TOKEN, "a")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "foo\n")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, "b")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, "foo\n baar ,,,")); - assertThat(parser.nextToken(new Token()), matches(EOF, "\n\t \n")); + void testEOFWithoutClosingQuote() throws Exception { + final String code = "a,\"b"; + try (Lexer lexer = createLexer(code, CSVFormat.Builder.create().setLenientEof(true).get())) { + assertNextToken(TOKEN, "a", lexer); + assertNextToken(EOF, "b", lexer); + } + try (Lexer lexer = createLexer(code, CSVFormat.Builder.create().setLenientEof(false).get())) { + assertNextToken(TOKEN, "a", lexer); + assertThrows(IOException.class, () -> lexer.nextToken(new Token())); + } + } + + @Test // TODO is this correct? Do we expect BACKSPACE to be unescaped? 
+ void testEscapedBackspace() throws Exception { + try (Lexer lexer = createLexer("character\\" + BACKSPACE + "Escaped", formatWithEscaping)) { + assertNextToken("character" + BACKSPACE + "Escaped", lexer); } } - // change delimiters, comment, encapsulater @Test - public void testNextToken6() throws IOException { - /* - * file: a;'b and \' more ' !comment;;;; ;; - */ - final String code = "a;'b and '' more\n'\n!comment;;;;\n;;"; - final CSVFormat format = CSVFormat.DEFAULT.withQuote('\'').withCommentMarker('!').withDelimiter(';'); - try (final Lexer parser = createLexer(code, format)) { - assertThat(parser.nextToken(new Token()), matches(TOKEN, "a")); - assertThat(parser.nextToken(new Token()), matches(EORECORD, "b and ' more\n")); + void testEscapedCharacter() throws Exception { + try (Lexer lexer = createLexer("character\\aEscaped", formatWithEscaping)) { + assertNextToken("character\\aEscaped", lexer); } } - // From CSV-1 @Test - public void testDelimiterIsWhitespace() throws IOException { - final String code = "one\ttwo\t\tfour \t five\t six"; - try (final Lexer parser = createLexer(code, CSVFormat.TDF)) { - assertThat(parser.nextToken(new Token()), matches(TOKEN, "one")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "two")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "four")); - assertThat(parser.nextToken(new Token()), matches(TOKEN, "five")); - assertThat(parser.nextToken(new Token()), matches(EOF, "six")); + void testEscapedControlCharacter() throws Exception { + // we are explicitly using an escape different from \ here + try (Lexer lexer = createLexer("character!rEscaped", CSVFormat.DEFAULT.withEscape('!'))) { + assertNextToken("character" + CR + "Escaped", lexer); } } @Test - public void testEscapedCR() throws Exception { - try (final Lexer lexer = createLexer("character\\" + CR + "Escaped", formatWithEscaping)) { - assertThat(lexer.nextToken(new Token()), 
hasContent("character" + CR + "Escaped")); + void testEscapedControlCharacter2() throws Exception { + try (Lexer lexer = createLexer("character\\rEscaped", CSVFormat.DEFAULT.withEscape('\\'))) { + assertNextToken("character" + CR + "Escaped", lexer); } } @Test - public void testCR() throws Exception { - try (final Lexer lexer = createLexer("character" + CR + "NotEscaped", formatWithEscaping)) { - assertThat(lexer.nextToken(new Token()), hasContent("character")); - assertThat(lexer.nextToken(new Token()), hasContent("NotEscaped")); + void testEscapedCR() throws Exception { + try (Lexer lexer = createLexer("character\\" + CR + "Escaped", formatWithEscaping)) { + assertNextToken("character" + CR + "Escaped", lexer); + } + } + + @Test // TODO is this correct? Do we expect FF to be unescaped? + void testEscapedFF() throws Exception { + try (Lexer lexer = createLexer("character\\" + FF + "Escaped", formatWithEscaping)) { + assertNextToken("character" + FF + "Escaped", lexer); } } @Test - public void testEscapedLF() throws Exception { - try (final Lexer lexer = createLexer("character\\" + LF + "Escaped", formatWithEscaping)) { - assertThat(lexer.nextToken(new Token()), hasContent("character" + LF + "Escaped")); + void testEscapedLF() throws Exception { + try (Lexer lexer = createLexer("character\\" + LF + "Escaped", formatWithEscaping)) { + assertNextToken("character" + LF + "Escaped", lexer); } } @Test - public void testLF() throws Exception { - try (final Lexer lexer = createLexer("character" + LF + "NotEscaped", formatWithEscaping)) { - assertThat(lexer.nextToken(new Token()), hasContent("character")); - assertThat(lexer.nextToken(new Token()), hasContent("NotEscaped")); + void testEscapedMySqlNullValue() throws Exception { + // MySQL uses \N to symbolize null values. We have to restore this + try (Lexer lexer = createLexer("character\\NEscaped", formatWithEscaping)) { + assertNextToken("character\\NEscaped", lexer); } } @Test // TODO is this correct? 
Do we expect TAB to be unescaped? - public void testEscapedTab() throws Exception { - try (final Lexer lexer = createLexer("character\\" + TAB + "Escaped", formatWithEscaping)) { - assertThat(lexer.nextToken(new Token()), hasContent("character" + TAB + "Escaped")); + void testEscapedTab() throws Exception { + try (Lexer lexer = createLexer("character\\" + TAB + "Escaped", formatWithEscaping)) { + assertNextToken("character" + TAB + "Escaped", lexer); } } @Test - public void testTab() throws Exception { - try (final Lexer lexer = createLexer("character" + TAB + "NotEscaped", formatWithEscaping)) { - assertThat(lexer.nextToken(new Token()), hasContent("character" + TAB + "NotEscaped")); + void testEscapingAtEOF() throws Exception { + final String code = "escaping at EOF is evil\\"; + try (Lexer lexer = createLexer(code, formatWithEscaping)) { + assertThrows(IOException.class, () -> lexer.nextToken(new Token())); } } - @Test // TODO is this correct? Do we expect BACKSPACE to be unescaped? - public void testEscapedBackspace() throws Exception { - try (final Lexer lexer = createLexer("character\\" + BACKSPACE + "Escaped", formatWithEscaping)) { - assertThat(lexer.nextToken(new Token()), hasContent("character" + BACKSPACE + "Escaped")); + @Test + void testFF() throws Exception { + try (Lexer lexer = createLexer("character" + FF + "NotEscaped", formatWithEscaping)) { + assertNextToken("character" + FF + "NotEscaped", lexer); } } @Test - public void testBackspace() throws Exception { - try (final Lexer lexer = createLexer("character" + BACKSPACE + "NotEscaped", formatWithEscaping)) { - assertThat(lexer.nextToken(new Token()), hasContent("character" + BACKSPACE + "NotEscaped")); + void testIgnoreEmptyLines() throws IOException { + // @formatter:off + final String code = "first,line,\n" + + "\n" + + "\n" + + "second,line\n" + + "\n" + + "\n" + + "third line \n" + + "\n" + + "\n" + + "last, line \n" + + "\n" + + "\n" + + "\n"; + // @formatter:on + final CSVFormat format = 
CSVFormat.DEFAULT.withIgnoreEmptyLines(); + try (Lexer lexer = createLexer(code, format)) { + assertNextToken(TOKEN, "first", lexer); + assertNextToken(TOKEN, "line", lexer); + assertNextToken(EORECORD, "", lexer); + assertNextToken(TOKEN, "second", lexer); + assertNextToken(EORECORD, "line", lexer); + assertNextToken(EORECORD, "third line ", lexer); + assertNextToken(TOKEN, "last", lexer); + assertNextToken(EORECORD, " line ", lexer); + assertNextToken(EOF, "", lexer); + assertNextToken(EOF, "", lexer); } } - @Test // TODO is this correct? Do we expect FF to be unescaped? - public void testEscapedFF() throws Exception { - try (final Lexer lexer = createLexer("character\\" + FF + "Escaped", formatWithEscaping)) { - assertThat(lexer.nextToken(new Token()), hasContent("character" + FF + "Escaped")); + @Test + void testIsMetaCharCommentStart() throws IOException { + try (Lexer lexer = createLexer("#", CSVFormat.DEFAULT.withCommentMarker('#'))) { + final int ch = lexer.readEscape(); + assertEquals('#', ch); } } @Test - public void testFF() throws Exception { - try (final Lexer lexer = createLexer("character" + FF + "NotEscaped", formatWithEscaping)) { - assertThat(lexer.nextToken(new Token()), hasContent("character" + FF + "NotEscaped")); + void testLF() throws Exception { + try (Lexer lexer = createLexer("character" + LF + "NotEscaped", formatWithEscaping)) { + assertNextToken("character", lexer); + assertNextToken("NotEscaped", lexer); } } + // encapsulator tokenizer (single line) @Test - public void testEscapedMySqlNullValue() throws Exception { - // MySQL uses \N to symbolize null values. 
We have to restore this - try (final Lexer lexer = createLexer("character\\NEscaped", formatWithEscaping)) { - assertThat(lexer.nextToken(new Token()), hasContent("character\\NEscaped")); + void testNextToken4() throws IOException { + /* + * file: a,"foo",b a, " foo",b a,"foo " ,b // whitespace after closing encapsulator a, " foo " ,b + */ + final String code = "a,\"foo\",b\na, \" foo\",b\na,\"foo \" ,b\na, \" foo \" ,b"; + try (Lexer lexer = createLexer(code, CSVFormat.DEFAULT.withIgnoreSurroundingSpaces())) { + assertNextToken(TOKEN, "a", lexer); + assertNextToken(TOKEN, "foo", lexer); + assertNextToken(EORECORD, "b", lexer); + assertNextToken(TOKEN, "a", lexer); + assertNextToken(TOKEN, " foo", lexer); + assertNextToken(EORECORD, "b", lexer); + assertNextToken(TOKEN, "a", lexer); + assertNextToken(TOKEN, "foo ", lexer); + assertNextToken(EORECORD, "b", lexer); + assertNextToken(TOKEN, "a", lexer); + assertNextToken(TOKEN, " foo ", lexer); + // assertTokenEquals(EORECORD, "b", parser); + assertNextToken(EOF, "b", lexer); } } + // encapsulator tokenizer (multi line, delimiter in string) @Test - public void testEscapedCharacter() throws Exception { - try (final Lexer lexer = createLexer("character\\aEscaped", formatWithEscaping)) { - assertThat(lexer.nextToken(new Token()), hasContent("character\\aEscaped")); + void testNextToken5() throws IOException { + final String code = "a,\"foo\n\",b\n\"foo\n baar ,,,\"\n\"\n\t \n\""; + try (Lexer lexer = createLexer(code, CSVFormat.DEFAULT)) { + assertNextToken(TOKEN, "a", lexer); + assertNextToken(TOKEN, "foo\n", lexer); + assertNextToken(EORECORD, "b", lexer); + assertNextToken(EORECORD, "foo\n baar ,,,", lexer); + assertNextToken(EOF, "\n\t \n", lexer); } } + // change delimiters, comment, encapsulater @Test - public void testEscapedControlCharacter() throws Exception { - // we are explicitly using an escape different from \ here - try (final Lexer lexer = createLexer("character!rEscaped", 
CSVFormat.DEFAULT.withEscape('!'))) { - assertThat(lexer.nextToken(new Token()), hasContent("character" + CR + "Escaped")); + void testNextToken6() throws IOException { + /* + * file: a;'b and \' more ' !comment;;;; ;; + */ + final String code = "a;'b and '' more\n'\n!comment;;;;\n;;"; + final CSVFormat format = CSVFormat.DEFAULT.withQuote('\'').withCommentMarker('!').withDelimiter(';'); + try (Lexer lexer = createLexer(code, format)) { + assertNextToken(TOKEN, "a", lexer); + assertNextToken(EORECORD, "b and ' more\n", lexer); } } + /** + * A truncated escaped multi-character delimiter at EOF must not be accepted by reusing the previous escape delimiter + * look-ahead in {@link Lexer#isEscapeDelimiter()}. + */ @Test - public void testEscapedControlCharacter2() throws Exception { - try (final Lexer lexer = createLexer("character\\rEscaped", CSVFormat.DEFAULT.withEscape('\\'))) { - assertThat(lexer.nextToken(new Token()), hasContent("character" + CR + "Escaped")); + void testPartialEscapedMultiCharacterDelimiterAtEOF() throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.builder().setDelimiter("[|]").setEscape('!').get(); + try (Lexer lexer = createLexer("x![!|!]y![!|", format)) { + assertNextToken(EOF, "x[|]y![!|", lexer); } } - @Test(expected = IOException.class) - public void testEscapingAtEOF() throws Exception { - final String code = "escaping at EOF is evil\\"; - try (final Lexer lexer = createLexer(code, formatWithEscaping)) { - lexer.nextToken(new Token()); + /** + * Tests CSV-324. 
+ */ + @Test + void testPartialMultiCharacterDelimiterAtEOF() throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.builder().setDelimiter("[|]").get(); + try (Lexer lexer = createLexer("a[|]b[|", format)) { + assertNextToken(TOKEN, "a", lexer); + assertNextToken(EOF, "b[|", lexer); + } + } + + /** + * A truncated multi-character delimiter at EOF must not be accepted by reusing the look-ahead buffer left dirty by an + * earlier non-matching peek in the same token (CSV-324 only cleared the buffer once per token). + */ + @Test + void testPartialMultiCharacterDelimiterAtEOFAfterMismatch() throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.builder().setDelimiter("[|]").get(); + // The "[a]" peek leaves ']' in the look-ahead buffer; the trailing "[|" must not match "[|]". + final String recordString = "x[a][|"; + try (Lexer lexer = createLexer(recordString, format)) { + assertNextToken(EOF, recordString, lexer); + } + } + + @Test + void testReadEscapeBackspace() throws IOException { + try (Lexer lexer = createLexer("b", CSVFormat.DEFAULT.withEscape('\b'))) { + final int ch = lexer.readEscape(); + assertEquals(BACKSPACE, ch); + } + } + + @Test + void testReadEscapeFF() throws IOException { + try (Lexer lexer = createLexer("f", CSVFormat.DEFAULT.withEscape('\f'))) { + final int ch = lexer.readEscape(); + assertEquals(FF, ch); + } + } + + @Test + void testReadEscapeTab() throws IOException { + try (Lexer lexer = createLexer("t", CSVFormat.DEFAULT.withEscape('\t'))) { + final int ch = lexer.readEscape(); + assertNextToken(EOF, "", lexer); + assertEquals(TAB, ch); + } + } + + @Test + void testSurroundingSpacesAreDeleted() throws IOException { + final String code = "noSpaces, leadingSpaces,trailingSpaces , surroundingSpaces , ,,"; + try (Lexer lexer = createLexer(code, CSVFormat.DEFAULT.withIgnoreSurroundingSpaces())) { + assertNextToken(TOKEN, "noSpaces", lexer); + assertNextToken(TOKEN, "leadingSpaces", lexer); + assertNextToken(TOKEN, 
"trailingSpaces", lexer); + assertNextToken(TOKEN, "surroundingSpaces", lexer); + assertNextToken(TOKEN, "", lexer); + assertNextToken(TOKEN, "", lexer); + assertNextToken(EOF, "", lexer); + } + } + + @Test + void testSurroundingTabsAreDeleted() throws IOException { + final String code = "noTabs,\tleadingTab,trailingTab\t,\tsurroundingTabs\t,\t\t,,"; + try (Lexer lexer = createLexer(code, CSVFormat.DEFAULT.withIgnoreSurroundingSpaces())) { + assertNextToken(TOKEN, "noTabs", lexer); + assertNextToken(TOKEN, "leadingTab", lexer); + assertNextToken(TOKEN, "trailingTab", lexer); + assertNextToken(TOKEN, "surroundingTabs", lexer); + assertNextToken(TOKEN, "", lexer); + assertNextToken(TOKEN, "", lexer); + assertNextToken(EOF, "", lexer); + } + } + + @Test + void testTab() throws Exception { + try (Lexer lexer = createLexer("character" + TAB + "NotEscaped", formatWithEscaping)) { + assertNextToken("character" + TAB + "NotEscaped", lexer); + } + } + + @Test + void testTrailingTextAfterQuote() throws Exception { + final String code = "\"a\" b,\"a\" \" b,\"a\" b \"\""; + try (Lexer lexer = createLexer(code, CSVFormat.Builder.create().setTrailingData(true).get())) { + assertNextToken(TOKEN, "a b", lexer); + assertNextToken(TOKEN, "a \" b", lexer); + assertNextToken(EOF, "a b \"\"", lexer); + } + try (Lexer parser = createLexer(code, CSVFormat.Builder.create().setTrailingData(false).get())) { + assertThrows(IOException.class, () -> parser.nextToken(new Token())); + } + } + + @Test + void testTrimTrailingSpacesZeroLength() throws Exception { + final StringBuilder buffer = new StringBuilder(""); + try (Lexer lexer = createLexer(buffer.toString(), CSVFormat.DEFAULT)) { + lexer.trimTrailingSpaces(buffer); + assertNextToken(EOF, "", lexer); } } } diff --git a/src/test/java/org/apache/commons/csv/PerformanceTest.java b/src/test/java/org/apache/commons/csv/PerformanceTest.java index 226ed96263..9284828e6c 100644 --- a/src/test/java/org/apache/commons/csv/PerformanceTest.java +++ 
b/src/test/java/org/apache/commons/csv/PerformanceTest.java @@ -1,195 +1,278 @@ /* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 + * https://www.apache.org/licenses/LICENSE-2.0 * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
*/ package org.apache.commons.csv; +import static org.apache.commons.io.IOUtils.EOF; + import java.io.BufferedReader; import java.io.File; import java.io.FileInputStream; import java.io.FileOutputStream; -import java.io.FileReader; import java.io.IOException; import java.io.InputStream; +import java.io.InputStreamReader; import java.io.OutputStream; +import java.io.Reader; import java.lang.reflect.Constructor; import java.lang.reflect.InvocationTargetException; +import java.nio.charset.StandardCharsets; +import java.nio.file.Files; +import java.nio.file.Paths; import java.util.zip.GZIPInputStream; +import org.apache.commons.io.FileUtils; import org.apache.commons.io.IOUtils; /** * Basic test harness. - * - * Requires test file to be downloaded separately. - * - * @version $Id$ */ @SuppressWarnings("boxing") -public class PerformanceTest { +class PerformanceTest { + + @FunctionalInterface + private interface CSVParserFactory { + CSVParser createParser() throws IOException; + } - private static final String[] PROPS = { - "java.version", // Java Runtime Environment version - "java.vendor", // Java Runtime Environment vendor + // Container for basic statistics + private static final class Stats { + final int count; + final int fields; + + Stats(final int c, final int f) { + count = c; + fields = f; + } + } + + private static final String[] PROPERTY_NAMES = { "java.version", // Java Runtime Environment version + "java.vendor", // Java Runtime Environment vendor // "java.vm.specification.version", // Java Virtual Machine specification version // "java.vm.specification.vendor", // Java Virtual Machine specification vendor // "java.vm.specification.name", // Java Virtual Machine specification name - "java.vm.version", // Java Virtual Machine implementation version + "java.vm.version", // Java Virtual Machine implementation version // "java.vm.vendor", // Java Virtual Machine implementation vendor - "java.vm.name", // Java Virtual Machine implementation name + 
"java.vm.name", // Java Virtual Machine implementation name // "java.specification.version", // Java Runtime Environment specification version // "java.specification.vendor", // Java Runtime Environment specification vendor // "java.specification.name", // Java Runtime Environment specification name - "os.name", // Operating system name - "os.arch", // Operating system architecture - "os.version", // Operating system version - + "os.name", // Operating system name + "os.arch", // Operating system architecture + "os.version", // Operating system version }; + private static int max = 11; // skip first test - private static int max = 10; - - private static int num = 0; // number of elapsed times recorded - private static long[] elapsedTimes = new long[max]; + private static int num; // number of elapsed times recorded + private static final long[] ELAPSED_TIMES = new long[max]; private static final CSVFormat format = CSVFormat.EXCEL; - private static final File BIG_FILE = new File(System.getProperty("java.io.tmpdir"), "worldcitiespop.txt"); + private static final String TEST_RESRC = "org/apache/commons/csv/perf/worldcitiespop.txt.gz"; - public static void main(final String [] args) throws Exception { + private static final File BIG_FILE = new File(FileUtils.getTempDirectoryPath(), "worldcitiespop.txt"); + + private static Reader createReader() throws IOException { + return new InputStreamReader(new FileInputStream(BIG_FILE), StandardCharsets.ISO_8859_1); + } + + private static Lexer createTestCSVLexer(final String test, final ExtendedBufferedReader input) + throws InstantiationException, IllegalAccessException, InvocationTargetException, Exception { + return test.startsWith("CSVLexer") ? getLexerCtor(test).newInstance(format, input) : new Lexer(format, input); + } + + private static Constructor getLexerCtor(final String clazz) throws Exception { + @SuppressWarnings("unchecked") + final Class lexer = (Class) Class.forName("org.apache.commons.csv." 
+ clazz); + return lexer.getConstructor(CSVFormat.class, ExtendedBufferedReader.class); + } + + private static Stats iterate(final Iterable iterable) { + int count = 0; + int fields = 0; + for (final CSVRecord record : iterable) { + count++; + fields += record.size(); + } + return new Stats(count, fields); + } + + public static void main(final String[] args) throws Exception { if (BIG_FILE.exists()) { - System.out.println(String.format("Found test fixture %s: %,d bytes.", BIG_FILE, BIG_FILE.length())); + System.out.printf("Found test fixture %s: %,d bytes.%n", BIG_FILE, BIG_FILE.length()); } else { - System.out.println("Decompressing test fixture " + BIG_FILE + "..."); - try (final InputStream input = new GZIPInputStream( - new FileInputStream("src/test/resources/perf/worldcitiespop.txt.gz")); - final OutputStream output = new FileOutputStream(BIG_FILE)) { + System.out.println("Decompressing test fixture to: " + BIG_FILE + "..."); + try (InputStream input = new GZIPInputStream(PerformanceTest.class.getClassLoader().getResourceAsStream(TEST_RESRC)); + OutputStream output = new FileOutputStream(BIG_FILE)) { IOUtils.copy(input, output); + System.out.println(String.format("Decompressed test fixture %s: %,d bytes.", BIG_FILE, BIG_FILE.length())); } - System.out.println(String.format("Decompressed test fixture %s: %,d bytes.", BIG_FILE, BIG_FILE.length())); } final int argc = args.length; - String tests[]; if (argc > 0) { - max=Integer.parseInt(args[0]); + max = Integer.parseInt(args[0]); } + + final String[] tests; if (argc > 1) { - tests = new String[argc-1]; - for (int i = 1; i < argc; i++) { - tests[i-1]=args[i]; - } + tests = new String[argc - 1]; + System.arraycopy(args, 1, tests, 0, argc - 1); } else { - tests=new String[]{"file", "split", "extb", "exts", "csv", "lexreset", "lexnew"}; + tests = new String[] { "file", "split", "extb", "exts", "csv", "csv-path", "csv-path-db", "csv-url", "lexreset", "lexnew" }; } - for(final String p : PROPS) { - 
System.out.println(p+"="+System.getProperty(p)); + for (final String p : PROPERTY_NAMES) { + System.out.printf("%s=%s%n", p, System.getProperty(p)); } - System.out.println("Max count: "+max+"\n"); + System.out.printf("Max count: %d%n%n", max); - for(final String test : tests) { - if ("file".equals(test)) { + for (final String test : tests) { + switch (test) { + case "file": testReadBigFile(false); - } else if ("split".equals(test)) { + break; + case "split": testReadBigFile(true); - } else if ("csv".equals(test)) { + break; + case "csv": testParseCommonsCSV(); - } else if ("lexreset".equals(test)) { + break; + case "csv-path": + testParsePath(); + break; + case "csv-path-db": + testParsePathDoubleBuffering(); + break; + case "csv-url": + testParseURL(); + break; + case "lexreset": testCSVLexer(false, test); - } else if ("lexnew".equals(test)) { + break; + case "lexnew": testCSVLexer(true, test); - } else if (test.startsWith("CSVLexer")) { - testCSVLexer(false, test); - } else if ("extb".equals(test)) { - testExtendedBuffer(false); - } else if ("exts".equals(test)) { - testExtendedBuffer(true); - } else { - System.out.println("Invalid test name: "+test); + break; + default: + if (test.startsWith("CSVLexer")) { + testCSVLexer(false, test); + } else if ("extb".equals(test)) { + testExtendedBuffer(false); + } else if ("exts".equals(test)) { + testExtendedBuffer(true); + } else { + System.out.printf("Invalid test name: %s%n", test); + } + break; } } } - private static BufferedReader createReader() throws IOException { - return new BufferedReader(new FileReader(BIG_FILE)); + private static Stats readAll(final BufferedReader in, final boolean split) throws IOException { + int count = 0; + int fields = 0; + String record; + while ((record = in.readLine()) != null) { + count++; + fields += split ? 
record.split(",").length : 1; + } + return new Stats(count, fields); } - // Container for basic statistics - private static class Stats { - final int count; - final int fields; - Stats(final int c, final int f) { - count=c; - fields=f; + // calculate and show average + private static void show() { + if (num > 1) { + long tot = 0; + for (int i = 1; i < num; i++) { // skip first test + tot += ELAPSED_TIMES[i]; + } + System.out.printf("%-20s: %5dms%n%n", "Average(not first)", tot / (num - 1)); } + num = 0; // ready for next set } // Display end stats; store elapsed for average private static void show(final String msg, final Stats s, final long start) { final long elapsed = System.currentTimeMillis() - start; - System.out.printf("%-20s: %5dms " + s.count + " lines "+ s.fields + " fields%n",msg,elapsed); - elapsedTimes[num++]=elapsed; - } - - // calculate and show average - private static void show(){ - long tot = 0; - if (num > 1) { - for(int i=1; i < num; i++) { // skip first test - tot += elapsedTimes[i]; - } - System.out.printf("%-20s: %5dms%n%n", "Average(not first)", tot/(num-1)); - } - num=0; // ready for next set + System.out.printf("%-20s: %5dms %d lines %d fields%n", msg, elapsed, s.count, s.fields); + ELAPSED_TIMES[num] = elapsed; + num++; } - private static void testReadBigFile(final boolean split) throws Exception { + private static void testCSVLexer(final boolean newToken, final String test) throws Exception { + Token token = new Token(); + String dynamic = ""; for (int i = 0; i < max; i++) { - final long startMillis; + final String simpleName; final Stats stats; - try (final BufferedReader in = createReader()) { + final long startMillis; + try (ExtendedBufferedReader input = new ExtendedBufferedReader(createReader()); + Lexer lexer = createTestCSVLexer(test, input)) { + if (test.startsWith("CSVLexer")) { + dynamic = "!"; + } + simpleName = lexer.getClass().getSimpleName(); + int count = 0; + int fields = 0; startMillis = System.currentTimeMillis(); - 
stats = readAll(in, split); + do { + if (newToken) { + token = new Token(); + } else { + token.reset(); + } + lexer.nextToken(token); + switch (token.type) { + case EOF: + break; + case EORECORD: + fields++; + count++; + break; + case INVALID: + throw new IOException("invalid parse sequence <" + token.content.toString() + ">"); + case TOKEN: + fields++; + break; + case COMMENT: // not really expecting these + break; + default: + throw new IllegalStateException("Unexpected Token type: " + token.type); + } + } while (!token.type.equals(Token.Type.EOF)); + stats = new Stats(count, fields); } - show(split ? "file+split" : "file", stats, startMillis); + show(simpleName + dynamic + " " + (newToken ? "new" : "reset"), stats, startMillis); } show(); } - private static Stats readAll(final BufferedReader in, final boolean split) throws IOException { - int count = 0; - int fields = 0; - String record; - while ((record=in.readLine()) != null) { - count++; - fields+= split ? record.split(",").length : 1; - } - return new Stats(count, fields); - } - private static void testExtendedBuffer(final boolean makeString) throws Exception { for (int i = 0; i < max; i++) { int fields = 0; int lines = 0; final long startMillis; - try (final ExtendedBufferedReader in = new ExtendedBufferedReader(createReader())) { + try (ExtendedBufferedReader in = new ExtendedBufferedReader(createReader())) { startMillis = System.currentTimeMillis(); int read; if (makeString) { StringBuilder sb = new StringBuilder(); - while ((read = in.read()) != -1) { + while ((read = in.read()) != EOF) { sb.append((char) read); if (read == ',') { // count delimiters sb.toString(); @@ -202,7 +285,7 @@ private static void testExtendedBuffer(final boolean makeString) throws Exceptio } } } else { - while ((read = in.read()) != -1) { + while ((read = in.read()) != EOF) { if (read == ',') { // count delimiters fields++; } else if (read == '\n') { @@ -218,89 +301,45 @@ private static void testExtendedBuffer(final boolean 
makeString) throws Exceptio } private static void testParseCommonsCSV() throws Exception { + testParser("CSV", () -> CSVParser.builder().setReader(createReader()).setFormat(format).get()); + } + + private static void testParsePath() throws Exception { + testParser("CSV-PATH", () -> CSVParser.parse(Files.newInputStream(Paths.get(BIG_FILE.toURI())), StandardCharsets.ISO_8859_1, format)); + } + + private static void testParsePathDoubleBuffering() throws Exception { + testParser("CSV-PATH-DB", () -> CSVParser.parse(Files.newBufferedReader(Paths.get(BIG_FILE.toURI()), StandardCharsets.ISO_8859_1), format)); + } + + private static void testParser(final String msg, final CSVParserFactory fac) throws Exception { for (int i = 0; i < max; i++) { final long startMillis; final Stats stats; - try (final BufferedReader reader = createReader()) { - try (final CSVParser parser = new CSVParser(reader, format)) { - startMillis = System.currentTimeMillis(); - stats = iterate(parser); - } - show("CSV", stats, startMillis); + try (CSVParser parser = fac.createParser()) { + startMillis = System.currentTimeMillis(); + stats = iterate(parser); } + show(msg, stats, startMillis); } show(); } + private static void testParseURL() throws Exception { + testParser("CSV-URL", () -> CSVParser.parse(BIG_FILE.toURI().toURL(), StandardCharsets.ISO_8859_1, format)); + } - private static Constructor getLexerCtor(final String clazz) throws Exception { - @SuppressWarnings("unchecked") - final Class lexer = (Class) Class.forName("org.apache.commons.csv." 
+ clazz); - return lexer.getConstructor(new Class[]{CSVFormat.class, ExtendedBufferedReader.class}); - } - - private static void testCSVLexer(final boolean newToken, final String test) throws Exception { - Token token = new Token(); - String dynamic = ""; + private static void testReadBigFile(final boolean split) throws Exception { for (int i = 0; i < max; i++) { - final String simpleName; - final Stats stats; final long startMillis; - try (final ExtendedBufferedReader input = new ExtendedBufferedReader(createReader()); - Lexer lexer = createTestCSVLexer(test, input)) { - if (test.startsWith("CSVLexer")) { - dynamic = "!"; - } - simpleName = lexer.getClass().getSimpleName(); - int count = 0; - int fields = 0; + final Stats stats; + try (BufferedReader in = new BufferedReader(createReader())) { startMillis = System.currentTimeMillis(); - do { - if (newToken) { - token = new Token(); - } else { - token.reset(); - } - lexer.nextToken(token); - switch (token.type) { - case EOF: - break; - case EORECORD: - fields++; - count++; - break; - case INVALID: - throw new IOException("invalid parse sequence <" + token.content.toString() + ">"); - case TOKEN: - fields++; - break; - case COMMENT: // not really expecting these - break; - default: - throw new IllegalStateException("Unexpected Token type: " + token.type); - } - } while (!token.type.equals(Token.Type.EOF)); - stats = new Stats(count, fields); + stats = readAll(in, split); } - show(simpleName + dynamic + " " + (newToken ? "new" : "reset"), stats, startMillis); + show(split ? "file+split" : "file", stats, startMillis); } show(); } +} - private static Lexer createTestCSVLexer(final String test, final ExtendedBufferedReader input) - throws InstantiationException, IllegalAccessException, InvocationTargetException, Exception { - return test.startsWith("CSVLexer") ? 
getLexerCtor(test) - .newInstance(new Object[] { format, input }) : new Lexer(format, input); - } - - private static Stats iterate(final Iterable it) { - int count = 0; - int fields = 0; - for (final CSVRecord record : it) { - count++; - fields+=record.size(); - } - return new Stats(count, fields); - } - -} \ No newline at end of file diff --git a/src/test/java/org/apache/commons/csv/TokenMatchers.java b/src/test/java/org/apache/commons/csv/TokenMatchers.java deleted file mode 100644 index cfe522e821..0000000000 --- a/src/test/java/org/apache/commons/csv/TokenMatchers.java +++ /dev/null @@ -1,89 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -package org.apache.commons.csv; - -import static org.hamcrest.core.AllOf.allOf; - -import org.hamcrest.Description; -import org.hamcrest.Matcher; -import org.hamcrest.TypeSafeDiagnosingMatcher; - -/** - * Collection of matchers for asserting the type and content of tokens. 
- */ -final class TokenMatchers { - - public static Matcher hasType(final Token.Type expectedType) { - return new TypeSafeDiagnosingMatcher() { - - @Override - public void describeTo(final Description description) { - description.appendText("token has type "); - description.appendValue(expectedType); - } - - @Override - protected boolean matchesSafely(final Token item, - final Description mismatchDescription) { - mismatchDescription.appendText("token type is "); - mismatchDescription.appendValue(item.type); - return item.type == expectedType; - } - }; - } - - public static Matcher hasContent(final String expectedContent) { - return new TypeSafeDiagnosingMatcher() { - - @Override - public void describeTo(final Description description) { - description.appendText("token has content "); - description.appendValue(expectedContent); - } - - @Override - protected boolean matchesSafely(final Token item, - final Description mismatchDescription) { - mismatchDescription.appendText("token content is "); - mismatchDescription.appendValue(item.content.toString()); - return expectedContent.equals(item.content.toString()); - } - }; - } - - public static Matcher isReady() { - return new TypeSafeDiagnosingMatcher() { - - @Override - public void describeTo(final Description description) { - description.appendText("token is ready "); - } - - @Override - protected boolean matchesSafely(final Token item, - final Description mismatchDescription) { - mismatchDescription.appendText("token is not ready "); - return item.isReady; - } - }; - } - - public static Matcher matches(final Token.Type expectedType, final String expectedContent) { - return allOf(hasType(expectedType), hasContent(expectedContent)); - } - -} diff --git a/src/test/java/org/apache/commons/csv/TokenMatchersTest.java b/src/test/java/org/apache/commons/csv/TokenMatchersTest.java deleted file mode 100644 index 5a4826b8bc..0000000000 --- a/src/test/java/org/apache/commons/csv/TokenMatchersTest.java +++ /dev/null @@ -1,70 +0,0 
@@ -/* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -package org.apache.commons.csv; - -import static org.apache.commons.csv.TokenMatchers.hasContent; -import static org.apache.commons.csv.TokenMatchers.hasType; -import static org.apache.commons.csv.TokenMatchers.isReady; -import static org.apache.commons.csv.TokenMatchers.matches; -import static org.junit.Assert.assertFalse; -import static org.junit.Assert.assertTrue; - -import org.junit.Before; -import org.junit.Test; - -public class TokenMatchersTest { - - private Token token; - - @Before - public void setUp() { - token = new Token(); - token.type = Token.Type.TOKEN; - token.isReady = true; - token.content.append("content"); - } - - @Test - public void testHasType() { - assertFalse(hasType(Token.Type.COMMENT).matches(token)); - assertFalse(hasType(Token.Type.EOF).matches(token)); - assertFalse(hasType(Token.Type.EORECORD).matches(token)); - assertTrue(hasType(Token.Type.TOKEN).matches(token)); - } - - @Test - public void testHasContent() { - assertFalse(hasContent("This is not the token's content").matches(token)); - assertTrue(hasContent("content").matches(token)); - } - - @Test - public void testIsReady() { - assertTrue(isReady().matches(token)); - token.isReady 
= false; - assertFalse(isReady().matches(token)); - } - - @Test - public void testMatches() { - assertTrue(matches(Token.Type.TOKEN, "content").matches(token)); - assertFalse(matches(Token.Type.EOF, "content").matches(token)); - assertFalse(matches(Token.Type.TOKEN, "not the content").matches(token)); - assertFalse(matches(Token.Type.EORECORD, "not the content").matches(token)); - } - -} diff --git a/src/test/java/org/apache/commons/csv/TokenTest.java b/src/test/java/org/apache/commons/csv/TokenTest.java new file mode 100644 index 0000000000..075c1b1d9c --- /dev/null +++ b/src/test/java/org/apache/commons/csv/TokenTest.java @@ -0,0 +1,50 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.commons.csv; + +import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertTrue; + +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.EnumSource; + +/** + * Tests {@link Token}. 
+ */ +class TokenTest { + + @ParameterizedTest + @EnumSource(Token.Type.class) + void testToString(final Token.Type type) { + // Should never blow up + final Token token = new Token(); + final String resetName = Token.Type.INVALID.name(); + assertTrue(token.toString().contains(resetName)); + token.reset(); + assertTrue(token.toString().contains(resetName)); + token.type = null; + assertFalse(token.toString().isEmpty()); + token.reset(); + token.type = type; + assertTrue(token.toString().contains(type.name())); + token.content.setLength(1000); + assertTrue(token.toString().contains(type.name())); + } +} diff --git a/src/test/java/org/apache/commons/csv/UserGuideTest.java b/src/test/java/org/apache/commons/csv/UserGuideTest.java new file mode 100644 index 0000000000..6cd8c72d7f --- /dev/null +++ b/src/test/java/org/apache/commons/csv/UserGuideTest.java @@ -0,0 +1,94 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.commons.csv; + +import static org.junit.jupiter.api.Assertions.assertEquals; + +import java.io.IOException; +import java.io.InputStreamReader; +import java.io.Reader; +import java.io.UnsupportedEncodingException; +import java.nio.charset.StandardCharsets; +import java.nio.file.Files; +import java.nio.file.Path; + +import org.apache.commons.io.input.BOMInputStream; +import org.junit.jupiter.api.Test; +import org.junit.jupiter.api.io.TempDir; + +/** + * Tests for the user guide. + */ +class UserGuideTest { + + @TempDir + Path tempDir; + + /** + * Creates a reader capable of handling BOMs. + * + * @param path The path to read. + * @return a new InputStreamReader for UTF-8 bytes. + * @throws IOException if an I/O error occurs. + */ + public InputStreamReader newReader(final Path path) throws IOException { + return new InputStreamReader(BOMInputStream.builder() + .setPath(path) + .get(), StandardCharsets.UTF_8); + } + + @Test + void testBomFull() throws UnsupportedEncodingException, IOException { + final Path path = tempDir.resolve("test1.csv"); + Files.copy(Utils.createUtf8Input("ColumnA, ColumnB, ColumnC\r\nA, B, C\r\n".getBytes(StandardCharsets.UTF_8), true), path); + // @formatter:off + try (Reader reader = new InputStreamReader(BOMInputStream.builder() + .setPath(path) + .get(), "UTF-8"); + CSVParser parser = CSVFormat.EXCEL.builder() + .setHeader() + .get() + .parse(reader)) { + // @formatter:on + for (final CSVRecord record : parser) { + final String string = record.get("ColumnA"); + assertEquals("A", string); + } + } + } + + @Test + void testBomUtil() throws UnsupportedEncodingException, IOException { + final Path path = tempDir.resolve("test2.csv"); + Files.copy(Utils.createUtf8Input("ColumnA, ColumnB, ColumnC\r\nA, B, C\r\n".getBytes(StandardCharsets.UTF_8), true), path); + try (Reader reader = newReader(path); + // @formatter:off + CSVParser parser = CSVFormat.EXCEL.builder() + .setHeader() + .get() + .parse(reader)) { + // 
@formatter:on + for (final CSVRecord record : parser) { + final String string = record.get("ColumnA"); + assertEquals("A", string); + } + } + } + +} diff --git a/src/test/java/org/apache/commons/csv/Utils.java b/src/test/java/org/apache/commons/csv/Utils.java index 164a91907e..5b5a05e043 100644 --- a/src/test/java/org/apache/commons/csv/Utils.java +++ b/src/test/java/org/apache/commons/csv/Utils.java @@ -1,48 +1,67 @@ /* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. + * https://www.apache.org/licenses/LICENSE-2.0 * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. 
See the License for the + * specific language governing permissions and limitations + * under the License. */ package org.apache.commons.csv; -import java.util.List; +import static org.junit.jupiter.api.Assertions.assertArrayEquals; +import static org.junit.jupiter.api.Assertions.assertEquals; -import org.junit.Assert; +import java.io.ByteArrayInputStream; +import java.io.InputStream; +import java.util.List; /** * Utility methods for test cases - * - * @version $Id$ */ final class Utils { - private Utils() { - } - /** * Checks if the 2d array has the same contents as the list of records. * - * @param message the message to be displayed + * @param message the message to be displayed * @param expected the 2d array of expected results - * @param actual the List of {@link CSVRecord} entries, each containing an array of values + * @param actual the List of {@link CSVRecord} entries, each containing an array of values + * @param maxRows the maximum number of rows expected, less than or equal to zero means no limit. */ - public static void compare(final String message, final String[][] expected, final List actual) { - Assert.assertEquals(message+" - outer array size", expected.length, actual.size()); - for (int i = 0; i < expected.length; i++) { - Assert.assertArrayEquals(message + " (entry " + i + ")", expected[i], actual.get(i).values()); + public static void compare(final String message, final String[][] expected, final List actual, final long maxRows) { + final long expectedLength = maxRows > 0 ? Math.min(maxRows, expected.length) : expected.length; + assertEquals(expectedLength, actual.size(), message + " - outer array size"); + for (int i = 0; i < expectedLength; i++) { + assertArrayEquals(expected[i], actual.get(i).values(), message + " (entry " + i + ")"); } } + + /** + * Creates an input stream, with or without a BOM. 
+ */ + static InputStream createUtf8Input(final byte[] baseData, final boolean addBom) { + byte[] data = baseData; + if (addBom) { + data = new byte[baseData.length + 3]; + data[0] = (byte) 0xEF; + data[1] = (byte) 0xBB; + data[2] = (byte) 0xBF; + System.arraycopy(baseData, 0, data, 3, baseData.length); + } + return new ByteArrayInputStream(data); + } + + private Utils() { + } } diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv148Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv148Test.java new file mode 100644 index 0000000000..67f1b785d5 --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv148Test.java @@ -0,0 +1,63 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertEquals; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.QuoteMode; +import org.junit.jupiter.api.Test; + +class JiraCsv148Test { + + @Test + void testWithIgnoreSurroundingSpacesEmpty() { + // @formatter:off + final CSVFormat format = CSVFormat.DEFAULT.builder() + .setQuoteMode(QuoteMode.ALL) + .setIgnoreSurroundingSpaces(true) + .get(); + // @formatter:on + assertEquals( + "\"\",\" \",\" Single space on the left\",\"Single space on the right \"," + + "\" Single spaces on both sides \",\" Multiple spaces on the left\"," + + "\"Multiple spaces on the right \",\" Multiple spaces on both sides \"", + format.format("", " ", " Single space on the left", "Single space on the right ", " Single spaces on both sides ", + " Multiple spaces on the left", "Multiple spaces on the right ", " Multiple spaces on both sides ")); + } + + /** + * Difference between withTrim() and withIgnoreSurroundingSpace(): withTrim() can remove the leading and trailing spaces and newlines inside + * quotation marks, while withIgnoreSurroundingSpace() cannot. Similarity: both can remove leading and trailing spaces, tabs, and other symbols.
+ */ + @Test + void testWithTrimEmpty() { + // @formatter:off + final CSVFormat format = CSVFormat.DEFAULT.builder() + .setQuoteMode(QuoteMode.ALL) + .setTrim(true) + .get(); + // @formatter:on + assertEquals( + "\"\",\"\",\"Single space on the left\",\"Single space on the right\",\"Single spaces on both sides\",\"Multiple spaces on the left\"," + + "\"Multiple spaces on the right\",\"Multiple spaces on both sides\"", + format.format("", " ", " Single space on the left", "Single space on the right ", " Single spaces on both sides ", + " Multiple spaces on the left", "Multiple spaces on the right ", " Multiple spaces on both sides ")); + } +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv149Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv149Test.java new file mode 100644 index 0000000000..b32e965665 --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv149Test.java @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertNotNull; + +import java.io.IOException; +import java.io.StringReader; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; +import org.apache.commons.csv.CSVRecord; +import org.junit.jupiter.api.Test; + +class JiraCsv149Test { + + private static final String CR_LF = "\r\n"; + + @Test + void testJiraCsv149EndWithEOL() throws IOException { + testJiraCsv149EndWithEolAtEof(true); + } + + private void testJiraCsv149EndWithEolAtEof(final boolean eolAtEof) throws IOException { + String source = "A,B,C,D" + CR_LF + "a1,b1,c1,d1" + CR_LF + "a2,b2,c2,d2"; + if (eolAtEof) { + source += CR_LF; + } + final StringReader reader = new StringReader(source); + // @formatter:off + final CSVFormat format = CSVFormat.RFC4180.builder() + .setHeader() + .setSkipHeaderRecord(true) + .setQuote('"') + .get(); + // @formatter:on + int lineCounter = 2; + try (CSVParser parser = CSVParser.builder().setReader(reader).setFormat(format).get()) { + for (final CSVRecord record : parser) { + assertNotNull(record); + assertEquals(lineCounter++, parser.getCurrentLineNumber()); + } + } + } + + @Test + void testJiraCsv149EndWithoutEOL() throws IOException { + testJiraCsv149EndWithEolAtEof(false); + } +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv150Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv150Test.java new file mode 100644 index 0000000000..eec91d52d0 --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv150Test.java @@ -0,0 +1,55 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. 
The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertEquals; + +import java.io.IOException; +import java.io.StringReader; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; +import org.junit.jupiter.api.Test; + +class JiraCsv150Test { + + private void testDisable(final CSVFormat format, final StringReader reader) throws IOException { + try (CSVParser csvParser = CSVParser.builder().setReader(reader).setFormat(format).get()) { + assertEquals(1, csvParser.getRecords().size()); + } + } + + @Test + void testDisableComment() throws IOException { + final StringReader stringReader = new StringReader("\"66\u2441\",,\"\",\"DeutscheBK\ufffe\",\"000\"\r\n"); + testDisable(CSVFormat.DEFAULT.builder().setCommentMarker(null).get(), stringReader); + } + + @Test + void testDisableEncapsulation() throws IOException { + final StringReader stringReader = new StringReader("66\u2441,,\"\",\ufffeDeutscheBK,\"000\"\r\n"); + testDisable(CSVFormat.DEFAULT.builder().setQuote(null).get(), stringReader); + } + + @Test + void testDisableEscaping() throws IOException { + final StringReader stringReader = new StringReader("\"66\u2441\",,\"\",\"DeutscheBK\ufffe\",\"000\"\r\n"); + testDisable(CSVFormat.DEFAULT.builder().setEscape(null).get(), stringReader); + } +} diff --git 
a/src/test/java/org/apache/commons/csv/issues/JiraCsv154Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv154Test.java new file mode 100644 index 0000000000..90d657fcd1 --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv154Test.java @@ -0,0 +1,69 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertTrue; + +import java.io.IOException; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVPrinter; +import org.junit.jupiter.api.Test; + +class JiraCsv154Test { + + @Test + void testJiraCsv154_withCommentMarker() throws IOException { + final String comment = "This is a header comment"; + // @formatter:off + final CSVFormat format = CSVFormat.EXCEL.builder() + .setHeader("H1", "H2") + .setCommentMarker('#') + .setHeaderComments(comment) + .get(); + // @formatter:on + final StringBuilder out = new StringBuilder(); + try (CSVPrinter printer = format.print(out)) { + printer.print("A"); + printer.print("B"); + } + final String s = out.toString(); + assertTrue(s.contains(comment), s); + } + + @Test + void testJiraCsv154_withHeaderComments() throws IOException { + final String comment = "This is a header comment"; + // @formatter:off + final CSVFormat format = CSVFormat.EXCEL.builder() + .setHeader("H1", "H2") + .setHeaderComments(comment) + .setCommentMarker('#') + .get(); + // @formatter:on + final StringBuilder out = new StringBuilder(); + try (CSVPrinter printer = format.print(out)) { + printer.print("A"); + printer.print("B"); + } + final String s = out.toString(); + assertTrue(s.contains(comment), s); + } + +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv164Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv164Test.java deleted file mode 100644 index 07e8f94b1e..0000000000 --- a/src/test/java/org/apache/commons/csv/issues/JiraCsv164Test.java +++ /dev/null @@ -1,57 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. 
- * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -package org.apache.commons.csv.issues; - -import static org.junit.Assert.assertTrue; - -import java.io.IOException; - -import org.apache.commons.csv.CSVFormat; -import org.apache.commons.csv.CSVPrinter; -import org.junit.Test; - -public class JiraCsv164Test { - - @Test - public void testJiraCsv154_withCommentMarker() throws IOException { - final String comment = "This is a header comment"; - final CSVFormat format = CSVFormat.EXCEL.withHeader("H1", "H2").withCommentMarker('#') - .withHeaderComments(comment); - final StringBuilder out = new StringBuilder(); - try (final CSVPrinter printer = format.print(out)) { - printer.print("A"); - printer.print("B"); - } - final String s = out.toString(); - assertTrue(s, s.contains(comment)); - } - - @Test - public void testJiraCsv154_withHeaderComments() throws IOException { - final String comment = "This is a header comment"; - final CSVFormat format = CSVFormat.EXCEL.withHeader("H1", "H2").withHeaderComments(comment) - .withCommentMarker('#'); - final StringBuilder out = new StringBuilder(); - try (final CSVPrinter printer = format.print(out)) { - printer.print("A"); - printer.print("B"); - } - final String s = out.toString(); - assertTrue(s, s.contains(comment)); - } - -} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv167Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv167Test.java index d4c41aaf58..607d0cf2a3 
100644 --- a/src/test/java/org/apache/commons/csv/issues/JiraCsv167Test.java +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv167Test.java @@ -1,24 +1,27 @@ /* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 + * https://www.apache.org/licenses/LICENSE-2.0 * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
*/ package org.apache.commons.csv.issues; +import static org.junit.jupiter.api.Assertions.assertEquals; + import java.io.BufferedReader; import java.io.IOException; -import java.io.InputStream; import java.io.InputStreamReader; import java.io.Reader; @@ -26,16 +29,20 @@ import org.apache.commons.csv.CSVParser; import org.apache.commons.csv.CSVRecord; import org.apache.commons.csv.QuoteMode; -import org.junit.Assert; -import org.junit.Test; +import org.junit.jupiter.api.Test; + +class JiraCsv167Test { -public class JiraCsv167Test { + private Reader getTestReader() { + return new InputStreamReader( + ClassLoader.getSystemClassLoader().getResourceAsStream("org/apache/commons/csv/csv-167/sample1.csv")); + } @Test - public void parse() throws IOException { + void testParse() throws IOException { int totcomment = 0; int totrecs = 0; - try (final BufferedReader br = new BufferedReader(getTestInput())) { + try (Reader reader = getTestReader(); BufferedReader br = new BufferedReader(reader)) { String s = null; boolean lastWasComment = false; while ((s = br.readLine()) != null) { @@ -50,25 +57,26 @@ public void parse() throws IOException { } } } - CSVFormat format = CSVFormat.DEFAULT; - // - format = format.withAllowMissingColumnNames(false); - format = format.withCommentMarker('#'); - format = format.withDelimiter(','); - format = format.withEscape('\\'); - format = format.withHeader("author", "title", "publishDate"); - format = format.withHeaderComments("headerComment"); - format = format.withNullString("NULL"); - format = format.withIgnoreEmptyLines(true); - format = format.withIgnoreSurroundingSpaces(true); - format = format.withQuote('"'); - format = format.withQuoteMode(QuoteMode.ALL); - format = format.withRecordSeparator('\n'); - format = format.withSkipHeaderRecord(false); - // + final CSVFormat format = CSVFormat.DEFAULT.builder() + // @formatter:off + .setAllowMissingColumnNames(false) + .setCommentMarker('#') + .setDelimiter(',') + .setEscape('\\') + 
.setHeader("author", "title", "publishDate") + .setHeaderComments("headerComment") + .setNullString("NULL") + .setIgnoreEmptyLines(true) + .setIgnoreSurroundingSpaces(true) + .setQuote('"') + .setQuoteMode(QuoteMode.ALL) + .setRecordSeparator('\n') + .setSkipHeaderRecord(false) + .get(); + // @formatter:on int comments = 0; int records = 0; - try (final CSVParser parser = format.parse(getTestInput())) { + try (Reader reader = getTestReader(); CSVParser parser = format.parse(reader)) { for (final CSVRecord csvRecord : parser) { records++; if (csvRecord.hasComment()) { @@ -77,12 +85,7 @@ public void parse() throws IOException { } } // Comment lines are concatenated, in this example 4 lines become 2 comments. - Assert.assertEquals(totcomment, comments); - Assert.assertEquals(totrecs, records); // records includes the header - } - - private Reader getTestInput() { - final InputStream is = ClassLoader.getSystemClassLoader().getResourceAsStream("csv-167/sample1.csv"); - return new InputStreamReader(is); + assertEquals(totcomment, comments); + assertEquals(totrecs, records); // records includes the header } } diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv198Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv198Test.java index c0c38b7fa6..1117c12ac9 100644 --- a/src/test/java/org/apache/commons/csv/issues/JiraCsv198Test.java +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv198Test.java @@ -1,47 +1,54 @@ /* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. 
See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 + * https://www.apache.org/licenses/LICENSE-2.0 * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
*/ + package org.apache.commons.csv.issues; +import static org.junit.jupiter.api.Assertions.assertNotNull; + import java.io.IOException; import java.io.InputStream; import java.io.InputStreamReader; import java.io.UnsupportedEncodingException; +import java.nio.charset.StandardCharsets; import org.apache.commons.csv.CSVFormat; import org.apache.commons.csv.CSVParser; -import org.apache.commons.csv.CSVRecord; -import org.junit.Assert; -import org.junit.Test; +import org.junit.jupiter.api.Test; -public class JiraCsv198Test { +class JiraCsv198Test { - private static final CSVFormat CSV_FORMAT = CSVFormat.EXCEL.withDelimiter('^').withFirstRecordAsHeader(); + // @formatter:off + private static final CSVFormat CSV_FORMAT = CSVFormat.EXCEL.builder() + .setDelimiter('^') + .setHeader() + .setSkipHeaderRecord(true) + .get(); + // @formatter:on @Test - public void test() throws UnsupportedEncodingException, IOException { - InputStream pointsOfReference = getClass().getResourceAsStream("/CSV-198/optd_por_public.csv"); - Assert.assertNotNull(pointsOfReference); + void test() throws UnsupportedEncodingException, IOException { + final InputStream pointsOfReference = getClass().getResourceAsStream("/org/apache/commons/csv/CSV-198/optd_por_public.csv"); + assertNotNull(pointsOfReference); try (@SuppressWarnings("resource") - CSVParser parser = CSV_FORMAT.parse(new InputStreamReader(pointsOfReference, "UTF-8"))) { - for (CSVRecord record : parser) { - String locationType = record.get("location_type"); - Assert.assertNotNull(locationType); - } + CSVParser parser = CSV_FORMAT.parse(new InputStreamReader(pointsOfReference, StandardCharsets.UTF_8))) { + parser.forEach(record -> assertNotNull(record.get("location_type"))); } } -} \ No newline at end of file +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv203Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv203Test.java new file mode 100644 index 0000000000..2c9226506c --- /dev/null +++ 
b/src/test/java/org/apache/commons/csv/issues/JiraCsv203Test.java @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertEquals; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVPrinter; +import org.apache.commons.csv.QuoteMode; +import org.junit.jupiter.api.Test; + +/** + * JIRA: withNullString value is printed without quotes when + * QuoteMode.ALL is specified + */ +class JiraCsv203Test { + + @Test + void testQuoteModeAll() throws Exception { + // @formatter:off + final CSVFormat format = CSVFormat.EXCEL.builder() + .setNullString("N/A") + .setIgnoreSurroundingSpaces(true) + .setQuoteMode(QuoteMode.ALL) + .get(); + // @formatter:on + final StringBuilder buffer = new StringBuilder(); + try (CSVPrinter printer = new CSVPrinter(buffer, format)) { + printer.printRecord(null, "Hello", null, "World"); + } + assertEquals("\"N/A\",\"Hello\",\"N/A\",\"World\"\r\n", buffer.toString()); + } + + @Test + void testQuoteModeAllNonNull() throws Exception { + // @formatter:off + final CSVFormat format = CSVFormat.EXCEL.builder() + .setNullString("N/A") + .setIgnoreSurroundingSpaces(true) 
+ .setQuoteMode(QuoteMode.ALL_NON_NULL) + .get(); + // @formatter:on + final StringBuilder buffer = new StringBuilder(); + try (CSVPrinter printer = new CSVPrinter(buffer, format)) { + printer.printRecord(null, "Hello", null, "World"); + } + assertEquals("N/A,\"Hello\",N/A,\"World\"\r\n", buffer.toString()); + } + + @Test + void testQuoteModeMinimal() throws Exception { + // @formatter:off + final CSVFormat format = CSVFormat.EXCEL.builder() + .setNullString("N/A") + .setIgnoreSurroundingSpaces(true) + .setQuoteMode(QuoteMode.MINIMAL) + .get(); + // @formatter:on + final StringBuilder buffer = new StringBuilder(); + try (CSVPrinter printer = new CSVPrinter(buffer, format)) { + printer.printRecord(null, "Hello", null, "World"); + } + assertEquals("N/A,Hello,N/A,World\r\n", buffer.toString()); + } + + @Test + void testQuoteModeNonNumeric() throws Exception { + // @formatter:off + final CSVFormat format = CSVFormat.EXCEL.builder() + .setNullString("N/A") + .setIgnoreSurroundingSpaces(true) + .setQuoteMode(QuoteMode.NON_NUMERIC) + .get(); + // @formatter:on + final StringBuilder buffer = new StringBuilder(); + try (CSVPrinter printer = new CSVPrinter(buffer, format)) { + printer.printRecord(null, "Hello", null, "World"); + } + assertEquals("N/A,\"Hello\",N/A,\"World\"\r\n", buffer.toString()); + } + + @Test + void testWithEmptyValues() throws Exception { + // @formatter:off + final CSVFormat format = CSVFormat.EXCEL.builder() + .setNullString("N/A") + .setIgnoreSurroundingSpaces(true) + .setQuoteMode(QuoteMode.ALL) + .get(); + // @formatter:on + final StringBuilder buffer = new StringBuilder(); + try (CSVPrinter printer = new CSVPrinter(buffer, format)) { + printer.printRecord("", "Hello", "", "World"); + // printer.printRecord(new Object[] { null, "Hello", null, "World" }); + } + assertEquals("\"\",\"Hello\",\"\",\"World\"\r\n", buffer.toString()); + } + + @Test + void testWithoutNullString() throws Exception { + // @formatter:off + final CSVFormat format = 
CSVFormat.EXCEL.builder() + //.setNullString("N/A") + .setIgnoreSurroundingSpaces(true) + .setQuoteMode(QuoteMode.ALL) + .get(); + // @formatter:on + final StringBuilder buffer = new StringBuilder(); + try (CSVPrinter printer = new CSVPrinter(buffer, format)) { + printer.printRecord(null, "Hello", null, "World"); + } + assertEquals(",\"Hello\",,\"World\"\r\n", buffer.toString()); + } + + @Test + void testWithoutQuoteMode() throws Exception { + // @formatter:off + final CSVFormat format = CSVFormat.EXCEL.builder() + .setNullString("N/A") + .setIgnoreSurroundingSpaces(true) + .get(); + // @formatter:on + final StringBuilder buffer = new StringBuilder(); + try (CSVPrinter printer = new CSVPrinter(buffer, format)) { + printer.printRecord(null, "Hello", null, "World"); + } + assertEquals("N/A,Hello,N/A,World\r\n", buffer.toString()); + } +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv206Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv206Test.java new file mode 100644 index 0000000000..2fecd10f16 --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv206Test.java @@ -0,0 +1,76 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertEquals; + +import java.io.IOException; +import java.io.StringReader; +import java.util.Iterator; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; +import org.apache.commons.csv.CSVPrinter; +import org.apache.commons.csv.CSVRecord; +import org.junit.jupiter.api.Test; + +class JiraCsv206Test { + + @Test + void testJiraCsv206MultipleCharacterDelimiter() throws IOException { + // Read with multiple character delimiter + final String source = "FirstName[|]LastName[|]Address\r\nJohn[|]Smith[|]123 Main St."; + final StringReader reader = new StringReader(source); + final CSVFormat format = CSVFormat.DEFAULT.builder().setDelimiter("[|]").get(); + CSVRecord record = null; + try (CSVParser csvParser = CSVParser.builder().setReader(reader).setFormat(format).get()) { + final Iterator iterator = csvParser.iterator(); + record = iterator.next(); + assertEquals("FirstName", record.get(0)); + assertEquals("LastName", record.get(1)); + assertEquals("Address", record.get(2)); + record = iterator.next(); + assertEquals("John", record.get(0)); + assertEquals("Smith", record.get(1)); + assertEquals("123 Main St.", record.get(2)); + } + // Write with multiple character delimiter + // @formatter:off + final String outString = + "# Change delimiter to [I]\r\n" + + "first name[I]last name[I]address\r\n" + + "John[I]Smith[I]123 Main St."; + // @formatter:on + final String comment = "Change delimiter to [I]"; + // @formatter:off + final CSVFormat formatExcel = CSVFormat.EXCEL.builder() + .setDelimiter("[I]").setHeader("first name", "last name", "address") + .setCommentMarker('#') + .setHeaderComments(comment).get(); + // @formatter:on + final StringBuilder out = new StringBuilder(); + try (CSVPrinter printer = formatExcel.print(out)) { + printer.print(record.get(0)); + printer.print(record.get(1)); + printer.print(record.get(2)); + } + final String s = 
out.toString(); + assertEquals(outString, s); + } +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv211Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv211Test.java new file mode 100644 index 0000000000..28b559d1e1 --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv211Test.java @@ -0,0 +1,54 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertEquals; + +import java.io.IOException; +import java.io.StringReader; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; +import org.junit.jupiter.api.Test; + +class JiraCsv211Test { + + @Test + void testJiraCsv211Format() throws IOException { + // @formatter:off + final CSVFormat printFormat = CSVFormat.DEFAULT.builder() + .setDelimiter('\t') + .setHeader("ID", "Name", "Country", "Age") + .get(); + // @formatter:on + final String formatted = printFormat.format("1", "Jane Doe", "USA", ""); + assertEquals("ID\tName\tCountry\tAge\r\n1\tJane Doe\tUSA\t", formatted); + + final CSVFormat parseFormat = CSVFormat.DEFAULT.builder().setDelimiter('\t').setHeader().setSkipHeaderRecord(true).get(); + try (CSVParser parser = parseFormat.parse(new StringReader(formatted))) { + parser.forEach(record -> { + assertEquals("1", record.get(0)); + assertEquals("Jane Doe", record.get(1)); + assertEquals("USA", record.get(2)); + assertEquals("", record.get(3)); + }); + } + } +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv213Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv213Test.java new file mode 100644 index 0000000000..90f5da4c5a --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv213Test.java @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.commons.csv.issues; + +import java.io.File; +import java.io.IOException; +import java.io.Reader; +import java.nio.charset.StandardCharsets; +import java.nio.file.Files; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; +import org.apache.commons.csv.QuoteMode; +import org.junit.jupiter.api.Test; + +/** + * Tests https://issues.apache.org/jira/browse/CSV-213 + *

+ * This is normal behavior with the current architecture: The iterator() API presents an object that is backed by data + * in the CSVParser as the parser is streaming over the file. The CSVParser is like a forward-only stream. When you + * create a new Iterator you are only creating a new view on the same position in the parser's stream. For the behavior + * you want, you need to open a new CSVParser. + *

+ */ +class JiraCsv213Test { + + private void createEndChannel(final File csvFile) { + // @formatter:off + final CSVFormat csvFormat = CSVFormat.DEFAULT.builder() + .setDelimiter(';') + .setHeader() + .setSkipHeaderRecord(true) + .setRecordSeparator('\n') + .setQuoteMode(QuoteMode.ALL) + .get(); + // @formatter:on + try (Reader reader = Files.newBufferedReader(csvFile.toPath(), StandardCharsets.UTF_8); + CSVParser parser = csvFormat.parse(reader)) { + if (parser.iterator().hasNext()) { + // System.out.println(parser.getCurrentLineNumber()); + // System.out.println(parser.getRecordNumber()); + // get only the first record; we don't need the others + parser.iterator().next(); // this fails + } + } catch (final IOException e) { + throw new IllegalStateException("Error while adding end channel to CSV", e); + } + } + + @Test + void test() { + createEndChannel(new File("src/test/resources/org/apache/commons/csv/CSV-213/999751170.patch.csv")); + } +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv227Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv227Test.java new file mode 100644 index 0000000000..2b9e335a8f --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv227Test.java @@ -0,0 +1,49 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. 
See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertEquals; + +import java.io.IOException; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVPrinter; +import org.apache.commons.csv.QuoteMode; +import org.junit.jupiter.api.Test; + +/** + * Tests https://issues.apache.org/jira/browse/CSV-227 + */ +class JiraCsv227Test { + + @Test + public void test() throws IOException { + final StringBuilder out = new StringBuilder(); + try (CSVPrinter printer = new CSVPrinter(out, CSVFormat.DEFAULT.withQuoteMode(QuoteMode.MINIMAL))) { + printer.printRecord("ㅁㅎㄷㄹ", "ㅁㅎㄷㄹ", "", "test2"); + printer.printRecord("한글3", "hello3", "3한글3", "test3"); + printer.printRecord("", "hello4", "", "test4"); + } + // ㅁㅎㄷㄹ,ㅁㅎㄷㄹ,,test2 + // 한글3,hello3,3한글3,test3 + // "",hello4,,test4 + assertEquals("ㅁㅎㄷㄹ,ㅁㅎㄷㄹ,,test2\r\n한글3,hello3,3한글3,test3\r\n\"\",hello4,,test4\r\n", out.toString()); + } +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv247Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv247Test.java new file mode 100644 index 0000000000..c2d9ac5910 --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv247Test.java @@ -0,0 +1,85 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertThrows; +import static org.junit.jupiter.api.Assertions.assertTrue; + +import java.io.Reader; +import java.io.StringReader; +import java.util.Arrays; +import java.util.Iterator; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; +import org.apache.commons.csv.CSVRecord; +import org.junit.jupiter.api.Test; + +class JiraCsv247Test { + + @Test + void testHeadersMissingOneColumnWhenAllowingMissingColumnNames() throws Exception { + final CSVFormat format = CSVFormat.DEFAULT.builder().setHeader().setAllowMissingColumnNames(true).get(); + + assertTrue(format.getAllowMissingColumnNames(), "We should allow missing column names"); + + final Reader in = new StringReader("a,,c,d,e\n1,2,3,4,5\nv,w,x,y,z"); + try (CSVParser parser = format.parse(in)) { + assertEquals(Arrays.asList("a", "", "c", "d", "e"), parser.getHeaderNames()); + final Iterator iterator = parser.iterator(); + CSVRecord record = iterator.next(); + assertEquals("1", record.get(0)); + assertEquals("2", record.get(1)); + assertEquals("3", record.get(2)); + assertEquals("4", record.get(3)); + assertEquals("5", record.get(4)); + record = iterator.next(); + assertEquals("v", record.get(0)); + assertEquals("w", record.get(1)); + assertEquals("x", record.get(2)); + assertEquals("y", record.get(3)); + assertEquals("z", record.get(4)); + 
assertFalse(iterator.hasNext()); + } + } + + @Test + void testHeadersMissingThrowsWhenNotAllowingMissingColumnNames() { + final CSVFormat format = CSVFormat.DEFAULT.builder().setHeader().get(); + + assertFalse(format.getAllowMissingColumnNames(), "By default we should not allow missing column names"); + + assertThrows(IllegalArgumentException.class, () -> { + try (Reader reader = new StringReader("a,,c,d,e\n1,2,3,4,5\nv,w,x,y,z"); + CSVParser parser = format.parse(reader);) { + // should fail + } + }, "1 missing column header is not allowed"); + + assertThrows(IllegalArgumentException.class, () -> { + try (Reader reader = new StringReader("a,,c,d,\n1,2,3,4,5\nv,w,x,y,z"); + CSVParser parser = format.parse(reader);) { + // should fail + } + }, "2+ missing column headers is not allowed!"); + } +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv248Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv248Test.java new file mode 100644 index 0000000000..480a9dffa9 --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv248Test.java @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertInstanceOf; +import static org.junit.jupiter.api.Assertions.assertNull; +import static org.junit.jupiter.api.Assertions.assertThrows; +import static org.junit.jupiter.api.Assertions.assertTrue; + +import java.io.IOException; +import java.io.InputStream; +import java.io.ObjectInputStream; + +import org.apache.commons.csv.CSVRecord; +import org.junit.jupiter.api.Test; + +class JiraCsv248Test { + + private static InputStream getTestInput() { + return ClassLoader.getSystemClassLoader().getResourceAsStream("org/apache/commons/csv/CSV-248/csvRecord.bin"); + } + + /** + * Test deserialization of a CSVRecord created using version 1.6. + * + *

+ * This test asserts that serialization from 1.8 onwards is consistent with previous versions. Serialization was + * broken in version 1.7. + * + * @throws IOException Signals that an I/O exception has occurred. + * @throws ClassNotFoundException If the CSVRecord cannot be deserialized + */ + @Test + void testJiraCsv248() throws IOException, ClassNotFoundException { + // Record was originally created using CSV version 1.6 with the following code: + // try (CSVParser parser = CSVParser.parse("A,B\n#my comment\nOne,Two", + // CSVFormat.DEFAULT.builder().setHeader().setCommentMarker('#'))) { + // CSVRecord rec = parser.iterator().next(); + // } + try (InputStream in = getTestInput(); ObjectInputStream ois = new ObjectInputStream(in)) { + final Object object = ois.readObject(); + assertInstanceOf(CSVRecord.class, object); + final CSVRecord rec = (CSVRecord) object; + assertEquals(1L, rec.getRecordNumber()); + assertEquals("One", rec.get(0)); + assertEquals("Two", rec.get(1)); + assertEquals(2, rec.size()); + // The comment and whitespace are ignored so this is not 17 but 4 + assertEquals(4, rec.getCharacterPosition()); + assertEquals("my comment", rec.getComment()); + // The parser is not serialized + assertNull(rec.getParser()); + // Check all header map functionality is absent + assertTrue(rec.isConsistent()); + assertFalse(rec.isMapped("A")); + assertFalse(rec.isSet("A")); + assertEquals(0, rec.toMap().size()); + // This will throw + assertThrows(IllegalStateException.class, () -> rec.get("A")); + } + } +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv249Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv249Test.java new file mode 100644 index 0000000000..4034b04bd7 --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv249Test.java @@ -0,0 +1,55 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. 
See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertEquals; + +import java.io.IOException; +import java.io.StringReader; +import java.io.StringWriter; +import java.util.List; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; +import org.apache.commons.csv.CSVPrinter; +import org.apache.commons.csv.CSVRecord; +import org.junit.jupiter.api.Test; + +class JiraCsv249Test { + + @Test + void testJiraCsv249() throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.builder().setEscape('\\').get(); + final StringWriter stringWriter = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(stringWriter, format)) { + printer.printRecord("foo \\", "bar"); + } + final StringReader reader = new StringReader(stringWriter.toString()); + final List records; + try (CSVParser parser = CSVParser.builder().setReader(reader).setFormat(format).get()) { + records = parser.getRecords(); + } + records.forEach(record -> { + assertEquals("foo \\", record.get(0)); + assertEquals("bar", record.get(1)); + }); + + } +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv253Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv253Test.java new file mode 100644 index 
0000000000..13bb6a8270 --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv253Test.java @@ -0,0 +1,56 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.commons.csv.issues; + +import static org.apache.commons.csv.CsvAssertions.assertValuesEquals; + +import java.io.IOException; +import java.io.StringReader; +import java.util.Iterator; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; +import org.apache.commons.csv.CSVRecord; +import org.apache.commons.csv.QuoteMode; +import org.junit.jupiter.api.Test; + +/** + * Setting QuoteMode:ALL_NON_NULL or NON_NUMERIC can distinguish between empty string columns and absent value columns. 
+ */ +class JiraCsv253Test { + + @Test + void testHandleAbsentValues() throws IOException { + // @formatter:off + final String source = + "\"John\",,\"Doe\"\n" + + ",\"AA\",123\n" + + "\"John\",90,\n" + + "\"\",,90"; + // @formatter:on + final CSVFormat csvFormat = CSVFormat.DEFAULT.builder().setQuoteMode(QuoteMode.NON_NUMERIC).get(); + try (CSVParser parser = csvFormat.parse(new StringReader(source))) { + final Iterator csvRecords = parser.iterator(); + assertValuesEquals(new String[] {"John", null, "Doe"}, csvRecords.next()); + assertValuesEquals(new String[] {null, "AA", "123"}, csvRecords.next()); + assertValuesEquals(new String[] {"John", "90", null}, csvRecords.next()); + assertValuesEquals(new String[] {"", null, "90"}, csvRecords.next()); + } + } +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv254Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv254Test.java new file mode 100644 index 0000000000..629b42ee6b --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv254Test.java @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.commons.csv.issues; + +import static org.apache.commons.csv.CsvAssertions.assertValuesEquals; + +import java.io.BufferedReader; +import java.io.IOException; +import java.nio.charset.StandardCharsets; +import java.nio.file.Files; +import java.nio.file.Paths; +import java.util.Iterator; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; +import org.apache.commons.csv.CSVRecord; +import org.junit.jupiter.api.Test; + +/** + * Tests https://issues.apache.org/jira/browse/CSV-254. + */ +class JiraCsv254Test { + + @Test + void test() throws IOException { + final CSVFormat csvFormat = CSVFormat.POSTGRESQL_CSV; + try (BufferedReader reader = Files.newBufferedReader(Paths.get("src/test/resources/org/apache/commons/csv/CSV-254/csv-254.csv"), + StandardCharsets.UTF_8); CSVParser parser = csvFormat.parse(reader)) { + final Iterator csvRecords = parser.iterator(); + assertValuesEquals(new String[] { "AA", "33", null }, csvRecords.next()); + assertValuesEquals(new String[] { "AA", null, "" }, csvRecords.next()); + assertValuesEquals(new String[] { null, "33", "CC" }, csvRecords.next()); + } + } +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv257Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv257Test.java new file mode 100644 index 0000000000..4234a7a0fa --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv257Test.java @@ -0,0 +1,84 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertThrows; + +import java.io.IOException; +import java.io.StringReader; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; +import org.junit.jupiter.api.Test; + +/** + * Tests https://issues.apache.org/jira/browse/CSV-257 + */ +class JiraCsv257Test { + + private static final String INPUT = ","; + + @Test + void testHeaderBuilder() throws IOException { + // @formatter:off + final CSVFormat format = CSVFormat.RFC4180.builder() + .setDelimiter(INPUT.charAt(0)) + .setHeader() + .setSkipHeaderRecord(true) + .setIgnoreSurroundingSpaces(true) + .get(); + // @formatter:on + // Document the current behavior: Throw an IllegalArgumentException if a header name is missing. + assertThrows(IllegalArgumentException.class, () -> { + try (CSVParser parser = CSVParser.parse(INPUT, format)) { + // empty + } + }); + } + + @Test + void testHeaderDepreacted() throws IOException { + // @formatter:off + final CSVFormat format = CSVFormat.RFC4180 + .withDelimiter(INPUT.charAt(0)) + .withFirstRecordAsHeader() + .withIgnoreSurroundingSpaces(); + // @formatter:on + // Document the current behavior: Throw an IllegalArgumentException if a header name is missing. 
+ assertThrows(IllegalArgumentException.class, () -> { + try (CSVParser parser = new CSVParser(new StringReader(INPUT), format)) { + // empty + } + }); + } + + @Test + void testNoHeaderBuilder() throws IOException { + // @formatter:off + final CSVFormat format = CSVFormat.RFC4180.builder() + .setDelimiter(INPUT.charAt(0)) + .setIgnoreSurroundingSpaces(true) + .get(); + // @formatter:on + try (CSVParser parser = CSVParser.parse(INPUT, format)) { + // empty + } + } +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv263Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv263Test.java new file mode 100644 index 0000000000..18bb9580a3 --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv263Test.java @@ -0,0 +1,79 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertEquals; + +import java.io.IOException; +import java.io.Reader; +import java.io.StringReader; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.QuoteMode; +import org.junit.jupiter.api.Test; + +/** + * Tests [CSV-263] Print from Reader with embedded quotes generates incorrect output. 
+ */ +class JiraCsv263Test { + + @Test + void testPrintFromReaderWithQuotes() throws IOException { + // @formatter:off + final CSVFormat format = CSVFormat.RFC4180.builder() + .setDelimiter(',') + .setQuote('"') + .setEscape('?') + .setQuoteMode(QuoteMode.NON_NUMERIC) + .get(); + // @formatter:on + final StringBuilder out = new StringBuilder(); + + final Reader atStartOnly = new StringReader("\"a,b,c\r\nx,y,z"); + format.print(atStartOnly, out, true); + assertEquals("\"\"\"a,b,c\r\nx,y,z\"", out.toString()); + + final Reader atEndOnly = new StringReader("a,b,c\r\nx,y,z\""); + out.setLength(0); + format.print(atEndOnly, out, true); + assertEquals("\"a,b,c\r\nx,y,z\"\"\"", out.toString()); + + final Reader atBeginEnd = new StringReader("\"a,b,c\r\nx,y,z\""); + out.setLength(0); + format.print(atBeginEnd, out, true); + assertEquals("\"\"\"a,b,c\r\nx,y,z\"\"\"", out.toString()); + + final Reader embeddedBeginMiddle = new StringReader("\"a\",b,c\r\nx,\"y\",z"); + out.setLength(0); + format.print(embeddedBeginMiddle, out, true); + assertEquals("\"\"\"a\"\",b,c\r\nx,\"\"y\"\",z\"", out.toString()); + + final Reader embeddedMiddleEnd = new StringReader("a,\"b\",c\r\nx,y,\"z\""); + out.setLength(0); + format.print(embeddedMiddleEnd, out, true); + assertEquals("\"a,\"\"b\"\",c\r\nx,y,\"\"z\"\"\"", out.toString()); + + final Reader nested = new StringReader("a,\"b \"and\" c\",d"); + out.setLength(0); + format.print(nested, out, true); + assertEquals("\"a,\"\"b \"\"and\"\" c\"\",d\"", out.toString()); + } + +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv264Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv264Test.java new file mode 100644 index 0000000000..857e42cb8f --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv264Test.java @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. 
See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertThrows; + +import java.io.IOException; +import java.io.StringReader; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; +import org.apache.commons.csv.DuplicateHeaderMode; +import org.junit.jupiter.api.Test; + +/** + * When {@link CSVFormat#withHeader(String...)} is not null; duplicate headers + * with empty strings should not be allowed. + * + * @see Jira Ticket + */ +class JiraCsv264Test { + + private static final String CSV_STRING = "\"\",\"B\",\"\"\n" + + "\"1\",\"2\",\"3\"\n" + + "\"4\",\"5\",\"6\""; + + /** + * A CSV file with a random gap in the middle. 
+ */ + private static final String CSV_STRING_GAP = "\"A\",\"B\",\"\",\"\",\"E\"\n" + + "\"1\",\"2\",\"\",\"\",\"5\"\n" + + "\"6\",\"7\",\"\",\"\",\"10\""; + + @Test + void testJiraCsv264() { + final CSVFormat csvFormat = CSVFormat.DEFAULT + .builder() + .setHeader() + .setDuplicateHeaderMode(DuplicateHeaderMode.DISALLOW) + .setAllowMissingColumnNames(true) + .get(); + try (StringReader reader = new StringReader(CSV_STRING)) { + assertThrows(IllegalArgumentException.class, () -> csvFormat.parse(reader)); + } + } + + @Test + void testJiraCsv264WithGapAllowEmpty() throws IOException { + final CSVFormat csvFormat = CSVFormat.DEFAULT + .builder() + .setHeader() + .setDuplicateHeaderMode(DuplicateHeaderMode.ALLOW_EMPTY) + .setAllowMissingColumnNames(true) + .get(); + try (StringReader reader = new StringReader(CSV_STRING_GAP); CSVParser parser = csvFormat.parse(reader)) { + // empty + } + } + + @Test + void testJiraCsv264WithGapDisallow() { + final CSVFormat csvFormat = CSVFormat.DEFAULT + .builder() + .setHeader() + .setDuplicateHeaderMode(DuplicateHeaderMode.DISALLOW) + .setAllowMissingColumnNames(true) + .get(); + try (StringReader reader = new StringReader(CSV_STRING_GAP)) { + assertThrows(IllegalArgumentException.class, () -> csvFormat.parse(reader)); + } + } +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv265Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv265Test.java new file mode 100644 index 0000000000..1bccad702f --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv265Test.java @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertEquals; + +import java.io.IOException; +import java.io.StringReader; +import java.util.Iterator; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; +import org.apache.commons.csv.CSVRecord; +import org.junit.jupiter.api.Test; + +/** + * Tests [CSV-265] {@link CSVRecord#getCharacterPosition()} returns the correct position after encountering a comment. + */ +class JiraCsv265Test { + + @Test + void testCharacterPositionWithComments() throws IOException { + // @formatter:off + final String csv = + "# Comment1\n" + + "Header1,Header2\n" + + "# Comment2\n" + + "Value1,Value2\n" + + "# Comment3\n" + + "Value3,Value4\n" + + "# Comment4\n"; + final CSVFormat csvFormat = CSVFormat.DEFAULT.builder() + .setCommentMarker('#') + .setHeader() + .setSkipHeaderRecord(true) + .get(); + // @formatter:on + try (CSVParser parser = csvFormat.parse(new StringReader(csv))) { + final Iterator itr = parser.iterator(); + final CSVRecord record1 = itr.next(); + assertEquals(csv.indexOf("# Comment2"), record1.getCharacterPosition()); + final CSVRecord record2 = itr.next(); + assertEquals(csv.indexOf("# Comment3"), record2.getCharacterPosition()); + } + } + + @Test + void testCharacterPositionWithCommentsSpanningMultipleLines() throws IOException { + // @formatter:off + final String csv = + "# Comment1\n" + + "# Comment2\n" + + "Header1,Header2\n" + + "# Comment3\n" + + "# Comment4\n" + + "Value1,Value2\n" + + "# Comment5\n" + + "# 
Comment6\n" + + "Value3,Value4"; + final CSVFormat csvFormat = CSVFormat.DEFAULT.builder() + .setCommentMarker('#') + .setHeader() + .setSkipHeaderRecord(true) + .get(); + // @formatter:on + try (CSVParser parser = csvFormat.parse(new StringReader(csv))) { + final Iterator itr = parser.iterator(); + final CSVRecord record1 = itr.next(); + assertEquals(csv.indexOf("# Comment3"), record1.getCharacterPosition()); + final CSVRecord record2 = itr.next(); + assertEquals(csv.indexOf("# Comment5"), record2.getCharacterPosition()); + } + } + +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv271Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv271Test.java new file mode 100644 index 0000000000..0269dec5d1 --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv271Test.java @@ -0,0 +1,56 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertEquals; + +import java.io.IOException; +import java.io.StringWriter; +import java.util.Arrays; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVPrinter; +import org.junit.jupiter.api.Test; + +class JiraCsv271Test { + + @Test + void testJiraCsv271_withArray() throws IOException { + final CSVFormat csvFormat = CSVFormat.DEFAULT; + final StringWriter stringWriter = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(stringWriter, csvFormat)) { + printer.print("a"); + printer.printRecord("b", "c"); + } + assertEquals("a,b,c\r\n", stringWriter.toString()); + } + + @Test + void testJiraCsv271_withList() throws IOException { + final CSVFormat csvFormat = CSVFormat.DEFAULT; + final StringWriter stringWriter = new StringWriter(); + try (CSVPrinter printer = new CSVPrinter(stringWriter, csvFormat)) { + printer.print("a"); + printer.printRecord(Arrays.asList("b", "c")); + } + assertEquals("a,b,c\r\n", stringWriter.toString()); + } + +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv288Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv288Test.java new file mode 100644 index 0000000000..065ee6bb37 --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv288Test.java @@ -0,0 +1,216 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertEquals; + +import java.io.IOException; +import java.io.Reader; +import java.io.StringReader; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; +import org.apache.commons.csv.CSVPrinter; +import org.apache.commons.csv.CSVRecord; +import org.junit.jupiter.api.Test; + +class JiraCsv288Test { + + private void print(final CSVRecord csvRecord, final CSVPrinter csvPrinter) throws IOException { + for (final String value : csvRecord) { + csvPrinter.print(value); + } + } + + @Test + // Before fix: + // expected: but was: + void testParseWithABADelimiter() throws Exception { + final Reader in = new StringReader("a|~|b|~|c|~|d|~||~|f"); + final StringBuilder stringBuilder = new StringBuilder(); + try (CSVPrinter csvPrinter = new CSVPrinter(stringBuilder, CSVFormat.EXCEL); + CSVParser parser = CSVParser.parse(in, CSVFormat.Builder.create().setDelimiter("|~|").get())) { + for (final CSVRecord csvRecord : parser) { + print(csvRecord, csvPrinter); + assertEquals("a,b,c,d,,f", stringBuilder.toString()); + } + } + } + + @Test + // Before fix: + // expected: but was: + void testParseWithDoublePipeDelimiter() throws Exception { + final Reader in = new StringReader("a||b||c||d||||f"); + final StringBuilder stringBuilder = new StringBuilder(); + try (CSVPrinter csvPrinter = new CSVPrinter(stringBuilder, CSVFormat.EXCEL); + CSVParser csvParser = CSVParser.parse(in, CSVFormat.Builder.create().setDelimiter("||").get())) { + for (final 
CSVRecord csvRecord : csvParser) { + print(csvRecord, csvPrinter); + assertEquals("a,b,c,d,,f", stringBuilder.toString()); + } + } + } + + @Test + // Regression, already passed before fix + + void testParseWithDoublePipeDelimiterDoubleCharValue() throws Exception { + final Reader in = new StringReader("a||bb||cc||dd||f"); + final StringBuilder stringBuilder = new StringBuilder(); + try (CSVPrinter csvPrinter = new CSVPrinter(stringBuilder, CSVFormat.EXCEL); + CSVParser csvParser = CSVParser.parse(in, CSVFormat.Builder.create().setDelimiter("||").get())) { + for (final CSVRecord csvRecord : csvParser) { + print(csvRecord, csvPrinter); + assertEquals("a,bb,cc,dd,f", stringBuilder.toString()); + } + } + } + + @Test + // Before fix: + // expected: but was: + void testParseWithDoublePipeDelimiterEndsWithDelimiter() throws Exception { + final Reader in = new StringReader("a||b||c||d||||f||"); + final StringBuilder stringBuilder = new StringBuilder(); + try (CSVPrinter csvPrinter = new CSVPrinter(stringBuilder, CSVFormat.EXCEL); + CSVParser csvParser = CSVParser.parse(in, CSVFormat.Builder.create().setDelimiter("||").get())) { + for (final CSVRecord csvRecord : csvParser) { + print(csvRecord, csvPrinter); + assertEquals("a,b,c,d,,f,", stringBuilder.toString()); + } + } + } + + @Test + // Before fix: + // expected: but was: + void testParseWithDoublePipeDelimiterQuoted() throws Exception { + final Reader in = new StringReader("a||\"b||c\"||d||||f"); + final StringBuilder stringBuilder = new StringBuilder(); + try (CSVPrinter csvPrinter = new CSVPrinter(stringBuilder, CSVFormat.EXCEL); + CSVParser csvParser = CSVParser.parse(in, CSVFormat.Builder.create().setDelimiter("||").get())) { + for (final CSVRecord csvRecord : csvParser) { + print(csvRecord, csvPrinter); + assertEquals("a,b||c,d,,f", stringBuilder.toString()); + } + } + } + + @Test + // Regression, already passed before fix + void testParseWithSinglePipeDelimiterEndsWithDelimiter() throws Exception { + final Reader 
in = new StringReader("a|b|c|d||f|"); + final StringBuilder stringBuilder = new StringBuilder(); + try (CSVPrinter csvPrinter = new CSVPrinter(stringBuilder, CSVFormat.EXCEL); + CSVParser csvParser = CSVParser.parse(in, CSVFormat.Builder.create().setDelimiter("|").get())) { + for (final CSVRecord csvRecord : csvParser) { + print(csvRecord, csvPrinter); + assertEquals("a,b,c,d,,f,", stringBuilder.toString()); + } + } + } + + @Test + // Before fix: + // expected: but was: + void testParseWithTriplePipeDelimiter() throws Exception { + final Reader in = new StringReader("a|||b|||c|||d||||||f"); + final StringBuilder stringBuilder = new StringBuilder(); + try (CSVPrinter csvPrinter = new CSVPrinter(stringBuilder, CSVFormat.EXCEL); + CSVParser csvParser = CSVParser.parse(in, CSVFormat.Builder.create().setDelimiter("|||").get())) { + for (final CSVRecord csvRecord : csvParser) { + print(csvRecord, csvPrinter); + assertEquals("a,b,c,d,,f", stringBuilder.toString()); + } + } + } + + @Test + // Regression, already passed before fix + void testParseWithTwoCharDelimiter1() throws Exception { + final Reader in = new StringReader("a~|b~|c~|d~|~|f"); + final StringBuilder stringBuilder = new StringBuilder(); + try (CSVPrinter csvPrinter = new CSVPrinter(stringBuilder, CSVFormat.EXCEL); + CSVParser csvParser = CSVParser.parse(in, CSVFormat.Builder.create().setDelimiter("~|").get())) { + for (final CSVRecord csvRecord : csvParser) { + print(csvRecord, csvPrinter); + assertEquals("a,b,c,d,,f", stringBuilder.toString()); + } + } + } + + @Test + // Regression, already passed before fix + void testParseWithTwoCharDelimiter2() throws Exception { + final Reader in = new StringReader("a~|b~|c~|d~|~|f~"); + final StringBuilder stringBuilder = new StringBuilder(); + try (CSVPrinter csvPrinter = new CSVPrinter(stringBuilder, CSVFormat.EXCEL); + CSVParser csvParser = CSVParser.parse(in, CSVFormat.Builder.create().setDelimiter("~|").get())) { + for (final CSVRecord csvRecord : csvParser) { + 
print(csvRecord, csvPrinter); + assertEquals("a,b,c,d,,f~", stringBuilder.toString()); + } + } + } + + @Test + // Regression, already passed before fix + void testParseWithTwoCharDelimiter3() throws Exception { + final Reader in = new StringReader("a~|b~|c~|d~|~|f|"); + final StringBuilder stringBuilder = new StringBuilder(); + try (CSVPrinter csvPrinter = new CSVPrinter(stringBuilder, CSVFormat.EXCEL); + CSVParser csvParser = CSVParser.parse(in, CSVFormat.Builder.create().setDelimiter("~|").get())) { + for (final CSVRecord csvRecord : csvParser) { + print(csvRecord, csvPrinter); + assertEquals("a,b,c,d,,f|", stringBuilder.toString()); + } + } + } + + @Test + // Regression, already passed before fix + void testParseWithTwoCharDelimiter4() throws Exception { + final Reader in = new StringReader("a~|b~|c~|d~|~|f~~||g"); + final StringBuilder stringBuilder = new StringBuilder(); + try (CSVPrinter csvPrinter = new CSVPrinter(stringBuilder, CSVFormat.EXCEL); + CSVParser csvParser = CSVParser.parse(in, CSVFormat.Builder.create().setDelimiter("~|").get())) { + for (final CSVRecord csvRecord : csvParser) { + print(csvRecord, csvPrinter); + assertEquals("a,b,c,d,,f~,|g", stringBuilder.toString()); + } + } + } + + @Test + // Before fix: + // expected: but was: + void testParseWithTwoCharDelimiterEndsWithDelimiter() throws Exception { + final Reader in = new StringReader("a~|b~|c~|d~|~|f~|"); + final StringBuilder stringBuilder = new StringBuilder(); + try (CSVPrinter csvPrinter = new CSVPrinter(stringBuilder, CSVFormat.EXCEL); + CSVParser csvParser = CSVParser.parse(in, CSVFormat.Builder.create().setDelimiter("~|").get())) { + for (final CSVRecord csvRecord : csvParser) { + print(csvRecord, csvPrinter); + assertEquals("a,b,c,d,,f,", stringBuilder.toString()); + } + } + } +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv290Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv290Test.java new file mode 100644 index 0000000000..f251eeb7a5 --- 
/dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv290Test.java @@ -0,0 +1,113 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertArrayEquals; +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertNull; + +import java.io.InputStreamReader; +import java.io.StringReader; +import java.io.StringWriter; +import java.util.ArrayList; +import java.util.Iterator; +import java.util.List; +import java.util.stream.Collectors; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; +import org.apache.commons.csv.CSVPrinter; +import org.apache.commons.csv.CSVRecord; +import org.junit.jupiter.api.Test; + +// psql (14.5 (Homebrew)) +// +// create table COMMONS_CSV_PSQL_TEST (ID INTEGER, COL1 VARCHAR, COL2 VARCHAR, COL3 VARCHAR, COL4 VARCHAR); +// insert into COMMONS_CSV_PSQL_TEST select 1, 'abc', 'test line 1' || chr(10) || 'test line 2', null, ''; +// insert into COMMONS_CSV_PSQL_TEST select 2, 'xyz', '\b:' || chr(8) || ' \t:' || chr(9) || ' \n:' || chr(10) || ' \r:' || chr(13), 'a', 'b'; +// insert into 
COMMONS_CSV_PSQL_TEST values (3, 'a', 'b,c,d', '"quoted"', 'e'); +// copy COMMONS_CSV_PSQL_TEST TO '/tmp/psql.csv' WITH (FORMAT CSV); +// copy COMMONS_CSV_PSQL_TEST TO '/tmp/psql.tsv'; +// +// cat /tmp/psql.csv +// 1,abc,"test line 1 +// test line 2",,"" +// 2,xyz,"\b:^H \t: \n: +// \r:^M",a,b +// 3,a,"b,c,d","""quoted""",e +// +// cat /tmp/psql.tsv +// 1 abc test line 1\ntest line 2 \N +// 2 xyz \\b:\b \\t:\t \\n:\n \\r:\r a b +// 3 a b,c,d "quoted" e +// +class JiraCsv290Test { + + private void testHelper(final String fileName, final CSVFormat format) throws Exception { + List<List<String>> content = new ArrayList<>(); + try (CSVParser csvParser = CSVParser.parse(new InputStreamReader(this.getClass().getResourceAsStream("/org/apache/commons/csv/CSV-290/" + fileName)), + format)) { + content = csvParser.stream().collect(Collectors.mapping(CSVRecord::toList, Collectors.toList())); + } + + assertEquals(3, content.size()); + + assertEquals("1", content.get(0).get(0)); + assertEquals("abc", content.get(0).get(1)); + assertEquals("test line 1\ntest line 2", content.get(0).get(2)); // new line + assertNull(content.get(0).get(3)); // null + assertEquals("", content.get(0).get(4)); + + assertEquals("2", content.get(1).get(0)); + assertEquals("\\b:\b \\t:\t \\n:\n \\r:\r", content.get(1).get(2)); // \b, \t, \n, \r + + assertEquals("3", content.get(2).get(0)); + assertEquals("b,c,d", content.get(2).get(2)); // value has comma + assertEquals("\"quoted\"", content.get(2).get(3)); // quoted + } + + @Test + void testPostgresqlCsv() throws Exception { + testHelper("psql.csv", CSVFormat.POSTGRESQL_CSV); + } + + @Test + void testPostgresqlText() throws Exception { + testHelper("psql.tsv", CSVFormat.POSTGRESQL_TEXT); + } + + @Test + void testWriteThenRead() throws Exception { + final StringWriter sw = new StringWriter(); + final CSVFormat format = CSVFormat.POSTGRESQL_CSV.builder().setHeader().setSkipHeaderRecord(true).get(); + try (CSVPrinter printer = new CSVPrinter(sw, format)) { + 
printer.printRecord("column1", "column2"); + printer.printRecord("v11", "v12"); + printer.printRecord("v21", "v22"); + printer.close(); + try (CSVParser parser = CSVParser.builder().setReader(new StringReader(sw.toString())).setFormat(format).get()) { + assertArrayEquals(new Object[] { "column1", "column2" }, parser.getHeaderNames().toArray()); + final Iterator<CSVRecord> i = parser.iterator(); + assertArrayEquals(new String[] { "v11", "v12" }, i.next().toList().toArray()); + assertArrayEquals(new String[] { "v21", "v22" }, i.next().toList().toArray()); + } + } + } +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv294Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv294Test.java new file mode 100644 index 0000000000..0e5de0751b --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv294Test.java @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertTrue; + +import java.io.ByteArrayInputStream; +import java.io.ByteArrayOutputStream; +import java.io.IOException; +import java.io.InputStreamReader; +import java.io.OutputStreamWriter; +import java.nio.charset.StandardCharsets; +import java.util.List; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; +import org.apache.commons.csv.CSVPrinter; +import org.apache.commons.csv.CSVRecord; +import org.junit.jupiter.api.Test; + +class JiraCsv294Test { + + private static void testInternal(final CSVFormat format, final String expectedSubstring) throws IOException { + final ByteArrayOutputStream bos = new ByteArrayOutputStream(); + try (CSVPrinter printer = new CSVPrinter(new OutputStreamWriter(bos, StandardCharsets.UTF_8), format)) { + printer.printRecord("a", "b \"\"", "c"); + } + final byte[] written = bos.toByteArray(); + final String writtenString = new String(written, StandardCharsets.UTF_8); + assertTrue(writtenString.contains(expectedSubstring)); + try (CSVParser parser = CSVParser.builder().setReader(new InputStreamReader(new ByteArrayInputStream(written), StandardCharsets.UTF_8)) + .setFormat(format).get()) { + final List<CSVRecord> records = parser.getRecords(); + assertEquals(1, records.size()); + final CSVRecord record = records.get(0); + assertEquals("a", record.get(0)); + assertEquals("b \"\"", record.get(1)); + assertEquals("c", record.get(2)); + } + } + + @Test + void testDefaultCsvFormatWithBackslashEscapeWorks() throws IOException { + testInternal(CSVFormat.Builder.create().setEscape('\\').get(), ",\"b \\\"\\\"\","); + } + + @Test + void testDefaultCsvFormatWithNullEscapeWorks() throws IOException { + testInternal(CSVFormat.Builder.create().setEscape(null).get(), ",\"b \"\"\"\"\","); + } + + @Test + void testDefaultCsvFormatWithQuoteEscapeWorks() throws IOException { + 
// this one doesn't actually work but should behave like setEscape(null) + // Printer is writing the expected content but Parser is unable to consume it + testInternal(CSVFormat.Builder.create().setEscape('"').get(), ",\"b \"\"\"\"\","); + } + + @Test + void testDefaultCsvFormatWorks() throws IOException { + testInternal(CSVFormat.Builder.create().get(), ",\"b \"\"\"\"\","); + } +} diff --git a/src/test/java/org/apache/commons/csv/issues/JiraCsv93Test.java b/src/test/java/org/apache/commons/csv/issues/JiraCsv93Test.java new file mode 100644 index 0000000000..7816412265 --- /dev/null +++ b/src/test/java/org/apache/commons/csv/issues/JiraCsv93Test.java @@ -0,0 +1,152 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * https://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.apache.commons.csv.issues; + +import static org.junit.jupiter.api.Assertions.assertEquals; + +import java.io.IOException; +import java.io.StringReader; + +import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; +import org.apache.commons.csv.CSVRecord; +import org.apache.commons.csv.QuoteMode; +import org.junit.jupiter.api.Test; + +/** + * Add more tests about null value. + *

+ * QuoteMode.ALL_NON_NULL quotes all non-null fields and leaves null fields unquoted. With + * withNullString("NULL"), the String value "NULL" and the null value (null) are formatted as '"NULL",NULL', so + * they should also be parsed back as a String value and a null value (["NULL", null]); the parser must distinguish + * the two. When nullString is not set in CSVFormat, the String '"",' should be parsed as "" and null (["", null]), + * because only non-null values are quoted. QuoteMode.NON_NUMERIC behaves the same as ALL_NON_NULL. + *

+ *

+ * This goes some way toward solving the problem of distinguishing empty string columns from absent value columns, + * much like Jira CSV-253. + *

+ */ +class JiraCsv93Test { + private static Object[] objects1 = {"abc", "", null, "a,b,c", 123}; + + private static Object[] objects2 = {"abc", "NULL", null, "a,b,c", 123}; + + private void every(final CSVFormat csvFormat, final Object[] objects, final String format, final String[] data) + throws IOException { + final String source = csvFormat.format(objects); + assertEquals(format, csvFormat.format(objects)); + try (CSVParser csvParser = csvFormat.parse(new StringReader(source))) { + final CSVRecord csvRecord = csvParser.iterator().next(); + for (int i = 0; i < data.length; i++) { + assertEquals(csvRecord.get(i), data[i]); + } + } + } + + @Test + void testWithNotSetNullString() throws IOException { + // @formatter:off + every(CSVFormat.DEFAULT, + objects1, + "abc,,,\"a,b,c\",123", + new String[]{"abc", "", "", "a,b,c", "123"}); + every(CSVFormat.DEFAULT.builder().setQuoteMode(QuoteMode.ALL).get(), + objects1, + "\"abc\",\"\",,\"a,b,c\",\"123\"", + new String[]{"abc", "", "", "a,b,c", "123"}); + every(CSVFormat.DEFAULT.builder().setQuoteMode(QuoteMode.ALL_NON_NULL).get(), + objects1, + "\"abc\",\"\",,\"a,b,c\",\"123\"", + new String[]{"abc", "", null, "a,b,c", "123"}); + every(CSVFormat.DEFAULT.builder().setQuoteMode(QuoteMode.MINIMAL).get(), + objects1, + "abc,,,\"a,b,c\",123", + new String[]{"abc", "", "", "a,b,c", "123"}); + every(CSVFormat.DEFAULT.builder().setEscape('?').setQuoteMode(QuoteMode.NONE).get(), + objects1, + "abc,,,a?,b?,c,123", + new String[]{"abc", "", "", "a,b,c", "123"}); + every(CSVFormat.DEFAULT.builder().setQuoteMode(QuoteMode.NON_NUMERIC).get(), + objects1, + "\"abc\",\"\",,\"a,b,c\",123", + new String[]{"abc", "", null, "a,b,c", "123"}); + // @formatter:on + } + + @Test + void testWithSetNullStringEmptyString() throws IOException { + // @formatter:off + every(CSVFormat.DEFAULT.builder().setNullString("").get(), + objects1, + "abc,,,\"a,b,c\",123", + new String[]{"abc", null, null, "a,b,c", "123"}); + 
every(CSVFormat.DEFAULT.builder().setNullString("").setQuoteMode(QuoteMode.ALL).get(), + objects1, + "\"abc\",\"\",\"\",\"a,b,c\",\"123\"", + new String[]{"abc", null, null, "a,b,c", "123"}); + every(CSVFormat.DEFAULT.builder().setNullString("").setQuoteMode(QuoteMode.ALL_NON_NULL).get(), + objects1, + "\"abc\",\"\",,\"a,b,c\",\"123\"", + new String[]{"abc", "", null, "a,b,c", "123"}); + every(CSVFormat.DEFAULT.builder().setNullString("").setQuoteMode(QuoteMode.MINIMAL).get(), + objects1, + "abc,,,\"a,b,c\",123", + new String[]{"abc", null, null, "a,b,c", "123"}); + every(CSVFormat.DEFAULT.builder().setNullString("").setEscape('?').setQuoteMode(QuoteMode.NONE).get(), + objects1, + "abc,,,a?,b?,c,123", + new String[]{"abc", null, null, "a,b,c", "123"}); + every(CSVFormat.DEFAULT.builder().setNullString("").setQuoteMode(QuoteMode.NON_NUMERIC).get(), + objects1, + "\"abc\",\"\",,\"a,b,c\",123", + new String[]{"abc", "", null, "a,b,c", "123"}); + // @formatter:on + } + + @Test + void testWithSetNullStringNULL() throws IOException { + // @formatter:off + every(CSVFormat.DEFAULT.builder().setNullString("NULL").get(), + objects2, + "abc,NULL,NULL,\"a,b,c\",123", + new String[]{"abc", null, null, "a,b,c", "123"}); + every(CSVFormat.DEFAULT.builder().setNullString("NULL").setQuoteMode(QuoteMode.ALL).get(), + objects2, + "\"abc\",\"NULL\",\"NULL\",\"a,b,c\",\"123\"", + new String[]{"abc", null, null, "a,b,c", "123"}); + every(CSVFormat.DEFAULT.builder().setNullString("NULL").setQuoteMode(QuoteMode.ALL_NON_NULL).get(), + objects2, + "\"abc\",\"NULL\",NULL,\"a,b,c\",\"123\"", + new String[]{"abc", "NULL", null, "a,b,c", "123"}); + every(CSVFormat.DEFAULT.builder().setNullString("NULL").setQuoteMode(QuoteMode.MINIMAL).get(), + objects2, + "abc,NULL,NULL,\"a,b,c\",123", + new String[]{"abc", null, null, "a,b,c", "123"}); + every(CSVFormat.DEFAULT.builder().setNullString("NULL").setEscape('?').setQuoteMode(QuoteMode.NONE).get(), + objects2, + "abc,NULL,NULL,a?,b?,c,123", + new 
String[]{"abc", null, null, "a,b,c", "123"}); + every(CSVFormat.DEFAULT.builder().setNullString("NULL").setQuoteMode(QuoteMode.NON_NUMERIC).get(), + objects2, + "\"abc\",\"NULL\",NULL,\"a,b,c\",123", + new String[]{"abc", "NULL", null, "a,b,c", "123"}); + // @formatter:on + } +} diff --git a/src/test/java/org/apache/commons/csv/perf/PerformanceTest.java b/src/test/java/org/apache/commons/csv/perf/PerformanceTest.java index 41b4d235b3..bead12378d 100644 --- a/src/test/java/org/apache/commons/csv/perf/PerformanceTest.java +++ b/src/test/java/org/apache/commons/csv/perf/PerformanceTest.java @@ -1,25 +1,26 @@ /* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 + * https://www.apache.org/licenses/LICENSE-2.0 * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. 
+ * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. */ package org.apache.commons.csv.perf; import java.io.BufferedReader; import java.io.File; -import java.io.FileInputStream; import java.io.FileNotFoundException; import java.io.FileOutputStream; import java.io.FileReader; @@ -30,52 +31,55 @@ import java.util.zip.GZIPInputStream; import org.apache.commons.csv.CSVFormat; +import org.apache.commons.csv.CSVParser; import org.apache.commons.csv.CSVRecord; +import org.apache.commons.io.FileUtils; import org.apache.commons.io.IOUtils; -import org.junit.BeforeClass; -import org.junit.Test; +import org.junit.jupiter.api.BeforeAll; +import org.junit.jupiter.api.Test; /** * Tests performance. * - * To run this test, use: mvn test -Dtest=PeformanceTest - * - * @version $Id$ + * To run this test, use: mvn test -Dtest=PerformanceTest */ -@SuppressWarnings("boxing") // test code -public class PerformanceTest { +class PerformanceTest { - private final int max = 10; + private static final String TEST_RESRC = "org/apache/commons/csv/perf/worldcitiespop.txt.gz"; - private static final File BIG_FILE = new File(System.getProperty("java.io.tmpdir"), "worldcitiespop.txt"); + private static final File BIG_FILE = new File(FileUtils.getTempDirectoryPath(), "worldcitiespop.txt"); - @BeforeClass + @BeforeAll public static void setUpClass() throws FileNotFoundException, IOException { if (BIG_FILE.exists()) { System.out.println(String.format("Found test fixture %s: %,d bytes.", BIG_FILE, BIG_FILE.length())); return; } - System.out.println("Decompressing test fixture " + BIG_FILE + "..."); - try (final InputStream input = new GZIPInputStream( - new FileInputStream("src/test/resources/perf/worldcitiespop.txt.gz")); - 
final OutputStream output = new FileOutputStream(BIG_FILE)) { + System.out.println("Decompressing test fixture to: " + BIG_FILE + "..."); + try (InputStream input = new GZIPInputStream(PerformanceTest.class.getClassLoader().getResourceAsStream(TEST_RESRC)); + OutputStream output = new FileOutputStream(BIG_FILE)) { IOUtils.copy(input, output); System.out.println(String.format("Decompressed test fixture %s: %,d bytes.", BIG_FILE, BIG_FILE.length())); } } + private final int max = 10; + private BufferedReader createBufferedReader() throws IOException { return new BufferedReader(new FileReader(BIG_FILE)); } - private long parse(final Reader in, final boolean traverseColumns) throws IOException { - final CSVFormat format = CSVFormat.DEFAULT.withIgnoreSurroundingSpaces(false); + private long parse(final Reader reader, final boolean traverseColumns) throws IOException { + final CSVFormat format = CSVFormat.DEFAULT.builder().setIgnoreSurroundingSpaces(false).get(); long recordCount = 0; - for (final CSVRecord record : format.parse(in)) { - recordCount++; - if (traverseColumns) { - for (@SuppressWarnings("unused") final String value : record) { - // do nothing for now + try (CSVParser parser = format.parse(reader)) { + for (final CSVRecord record : parser) { + recordCount++; + if (traverseColumns) { + for (@SuppressWarnings("unused") + final String value : record) { + // do nothing for now + } } } } @@ -86,7 +90,7 @@ private void println(final String s) { System.out.println(s); } - private long readAll(final BufferedReader in) throws IOException { + private long readLines(final BufferedReader in) throws IOException { long count = 0; while (in.readLine() != null) { count++; @@ -96,36 +100,38 @@ private long readAll(final BufferedReader in) throws IOException { public long testParseBigFile(final boolean traverseColumns) throws Exception { final long startMillis = System.currentTimeMillis(); - final long count = this.parse(this.createBufferedReader(), traverseColumns); - final 
long totalMillis = System.currentTimeMillis() - startMillis; - this.println(String.format("File parsed in %,d milliseconds with Commons CSV: %,d lines.", totalMillis, count)); - return totalMillis; + try (BufferedReader reader = createBufferedReader()) { + final long count = parse(reader, traverseColumns); + final long totalMillis = System.currentTimeMillis() - startMillis; + println( + String.format("File parsed in %,d milliseconds with Commons CSV: %,d lines.", totalMillis, count)); + return totalMillis; + } } @Test - public void testParseBigFileRepeat() throws Exception { + void testParseBigFileRepeat() throws Exception { long bestTime = Long.MAX_VALUE; for (int i = 0; i < this.max; i++) { - bestTime = Math.min(this.testParseBigFile(false), bestTime); + bestTime = Math.min(testParseBigFile(false), bestTime); } - this.println(String.format("Best time out of %,d is %,d milliseconds.", this.max, bestTime)); + println(String.format("Best time out of %,d is %,d milliseconds.", this.max, bestTime)); } @Test - public void testReadBigFile() throws Exception { + void testReadBigFile() throws Exception { long bestTime = Long.MAX_VALUE; + long count; for (int i = 0; i < this.max; i++) { final long startMillis; - long count; - try (final BufferedReader in = this.createBufferedReader()) { + try (BufferedReader in = createBufferedReader()) { startMillis = System.currentTimeMillis(); - count = 0; - count = this.readAll(in); + count = readLines(in); } final long totalMillis = System.currentTimeMillis() - startMillis; bestTime = Math.min(totalMillis, bestTime); - this.println(String.format("File read in %,d milliseconds: %,d lines.", totalMillis, count)); + println(String.format("File read in %,d milliseconds: %,d lines.", totalMillis, count)); } - this.println(String.format("Best time out of %,d is %,d milliseconds.", this.max, bestTime)); + println(String.format("Best time out of %,d is %,d milliseconds.", this.max, bestTime)); } -} \ No newline at end of file +} diff --git 
a/src/test/resources/org/apache/commons/csv/CSV-141/csv-141.csv b/src/test/resources/org/apache/commons/csv/CSV-141/csv-141.csv new file mode 100644 index 0000000000..e685adc88f --- /dev/null +++ b/src/test/resources/org/apache/commons/csv/CSV-141/csv-141.csv @@ -0,0 +1,4 @@ +"1414770317901","android.widget.EditText","pass sem1 _84*|*","0","pass sem1 _8" +"1414770318470","android.widget.EditText","pass sem1 _84:|","0","pass sem1 _84:\" +"1414770318327","android.widget.EditText","pass sem1 +"1414770318628","android.widget.EditText","pass sem1 _84*|*","0","pass sem1 diff --git a/src/test/resources/org/apache/commons/csv/CSV-196/emoji.csv b/src/test/resources/org/apache/commons/csv/CSV-196/emoji.csv new file mode 100644 index 0000000000..0bff7a44f3 --- /dev/null +++ b/src/test/resources/org/apache/commons/csv/CSV-196/emoji.csv @@ -0,0 +1,5 @@ +id,val1,val2,val3,val4,val5,val6,val7,val8,val9,val10,val11,val12,val13,val14,val15 +1,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄 +2,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄 +3,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄 +4,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄,😄😄😄😄😄😄😄😄😄😄 \ No newline at end of file diff --git a/src/test/resources/org/apache/commons/csv/CSV-196/japanese.csv b/src/test/resources/org/apache/commons/csv/CSV-196/japanese.csv new file mode 100644 index 0000000000..b06e04bd6a --- /dev/null +++ b/src/test/resources/org/apache/commons/csv/CSV-196/japanese.csv @@ -0,0 +1,4 @@ 
+id,date,val1,val2,val3,val4,val5,val6,val7,val8,val9,val10,val11,val12,val13,val14,val15 +00000000000001,2017-01-01,きちんと節分近くには咲いてる。自然の力ってすごいな～,v2,v3,v4,v5,v6,v7,v8,v9,v10,v11,v12,v13,v14,v15 +00000000000002,2017-01-01,きちんと節分近くには咲いてる。自然の力ってすごいな～,v2,v3,v4,v5,v6,v7,v8,v9,v10,v11,v12,v13,v14,v15 +00000000000003,2017-01-01,きちんと節分近くには咲いてる。自然の力ってすごいな～,v2,v3,v4,v5,v6,v7,v8,v9,v10,v11,v12,v13,v14,v15 \ No newline at end of file diff --git a/src/test/resources/CSV-198/optd_por_public.csv b/src/test/resources/org/apache/commons/csv/CSV-198/optd_por_public.csv similarity index 100% rename from src/test/resources/CSV-198/optd_por_public.csv rename to src/test/resources/org/apache/commons/csv/CSV-198/optd_por_public.csv diff --git a/src/test/resources/org/apache/commons/csv/CSV-213/999751170.patch.csv b/src/test/resources/org/apache/commons/csv/CSV-213/999751170.patch.csv new file mode 100644 index 0000000000..42a9d1f3a4 --- /dev/null +++ b/src/test/resources/org/apache/commons/csv/CSV-213/999751170.patch.csv @@ -0,0 +1,2 @@ +"CHANELID";"ADDRESS" +"27";"" diff --git a/src/test/resources/org/apache/commons/csv/CSV-248/csvRecord.bin b/src/test/resources/org/apache/commons/csv/CSV-248/csvRecord.bin new file mode 100644 index 0000000000..36047d5020 Binary files /dev/null and b/src/test/resources/org/apache/commons/csv/CSV-248/csvRecord.bin differ diff --git a/src/test/resources/org/apache/commons/csv/CSV-254/csv-254.csv b/src/test/resources/org/apache/commons/csv/CSV-254/csv-254.csv new file mode 100644 index 0000000000..e7d2972c5a --- /dev/null +++ b/src/test/resources/org/apache/commons/csv/CSV-254/csv-254.csv @@ -0,0 +1,3 @@ +AA,33, +AA,,"" +,33,CC diff --git a/src/test/resources/org/apache/commons/csv/CSV-259/sample.txt b/src/test/resources/org/apache/commons/csv/CSV-259/sample.txt new file mode 100644 index 0000000000..7d1adf7299 --- /dev/null +++ b/src/test/resources/org/apache/commons/csv/CSV-259/sample.txt @@ -0,0 +1 @@ +x,y,z \ No newline at end of file diff --git 
a/src/test/resources/org/apache/commons/csv/CSV-290/psql.csv b/src/test/resources/org/apache/commons/csv/CSV-290/psql.csv new file mode 100644 index 0000000000..dd50f5a642 --- /dev/null +++ b/src/test/resources/org/apache/commons/csv/CSV-290/psql.csv @@ -0,0 +1,5 @@ +1,abc,"test line 1 +test line 2",,"" +2,xyz,"\b: \t: \n: + \r: ",a,b +3,a,"b,c,d","""quoted""",e diff --git a/src/test/resources/org/apache/commons/csv/CSV-290/psql.tsv b/src/test/resources/org/apache/commons/csv/CSV-290/psql.tsv new file mode 100644 index 0000000000..5358d8eac6 --- /dev/null +++ b/src/test/resources/org/apache/commons/csv/CSV-290/psql.tsv @@ -0,0 +1,3 @@ +1 abc test line 1\ntest line 2 \N +2 xyz \\b:\b \\t:\t \\n:\n \\r:\r a b +3 a b,c,d "quoted" e diff --git a/src/test/resources/CSVFileParser/README.txt b/src/test/resources/org/apache/commons/csv/CSVFileParser/README.txt similarity index 100% rename from src/test/resources/CSVFileParser/README.txt rename to src/test/resources/org/apache/commons/csv/CSVFileParser/README.txt diff --git a/src/test/resources/CSVFileParser/bom.csv b/src/test/resources/org/apache/commons/csv/CSVFileParser/bom.csv similarity index 100% rename from src/test/resources/CSVFileParser/bom.csv rename to src/test/resources/org/apache/commons/csv/CSVFileParser/bom.csv diff --git a/src/test/resources/CSVFileParser/test.csv b/src/test/resources/org/apache/commons/csv/CSVFileParser/test.csv similarity index 93% rename from src/test/resources/CSVFileParser/test.csv rename to src/test/resources/org/apache/commons/csv/CSVFileParser/test.csv index ebdb952594..93101ed334 100644 --- a/src/test/resources/CSVFileParser/test.csv +++ b/src/test/resources/org/apache/commons/csv/CSVFileParser/test.csv @@ -1,16 +1,16 @@ -A,B,C,"D" -# plain values -a,b,c,d -# spaces before and after - e ,f , g,h -# quoted: with spaces before and after -" i ", " j " , " k "," l " -# empty values -,,, -# empty quoted values -"","","","" -# 3 empty lines - - - -# EOF on next line +A,B,C,"D" +# plain 
values +a,b,c,d +# spaces before and after + e ,f , g,h +# quoted: with spaces before and after +" i ", " j " , " k "," l " +# empty values +,,, +# empty quoted values +"","","","" +# 3 empty lines + + + +# EOF on next line diff --git a/src/test/resources/org/apache/commons/csv/CSVFileParser/testCSV246.csv b/src/test/resources/org/apache/commons/csv/CSVFileParser/testCSV246.csv new file mode 100644 index 0000000000..f01a214708 --- /dev/null +++ b/src/test/resources/org/apache/commons/csv/CSVFileParser/testCSV246.csv @@ -0,0 +1,8 @@ +a,b,c,e,f +# Very Long +# Comment 2 +g,h,i,j,k +# Very Long + +# Comment 3 +l,m,n,o,p \ No newline at end of file diff --git a/src/test/resources/org/apache/commons/csv/CSVFileParser/testCSV246_checkWithNoComment.txt b/src/test/resources/org/apache/commons/csv/CSVFileParser/testCSV246_checkWithNoComment.txt new file mode 100644 index 0000000000..cb7d21f01a --- /dev/null +++ b/src/test/resources/org/apache/commons/csv/CSVFileParser/testCSV246_checkWithNoComment.txt @@ -0,0 +1,10 @@ +testCSV246.csv CommentStart=# CheckComments +Delimiter=<,> QuoteChar=<"> CommentStart=<#> SkipHeaderRecord:false +5:[a, b, c, e, f] +# Very Long +# Comment 2 +5:[g, h, i, j, k]#Very Long\nComment 2 +# Very Long +1:[]#Very Long +# Comment 3 +5:[l, m, n, o, p]#Comment 3 \ No newline at end of file diff --git a/src/test/resources/CSVFileParser/testCSV85.csv b/src/test/resources/org/apache/commons/csv/CSVFileParser/testCSV85.csv similarity index 91% rename from src/test/resources/CSVFileParser/testCSV85.csv rename to src/test/resources/org/apache/commons/csv/CSVFileParser/testCSV85.csv index b1baab30cf..69bb80e3b6 100644 --- a/src/test/resources/CSVFileParser/testCSV85.csv +++ b/src/test/resources/org/apache/commons/csv/CSVFileParser/testCSV85.csv @@ -1,9 +1,9 @@ -# Comment 1 -a,b,c,e,f -# Very Long -# Comment 2 -g,h,i,j,k -# Very Long - -# Comment 3 +# Comment 1 +a,b,c,e,f +# Very Long +# Comment 2 +g,h,i,j,k +# Very Long + +# Comment 3 l,m,n,o,p \ No newline at 
end of file
diff --git a/src/test/resources/CSVFileParser/testCSV85_default.txt b/src/test/resources/org/apache/commons/csv/CSVFileParser/testCSV85_default.txt
similarity index 100%
rename from src/test/resources/CSVFileParser/testCSV85_default.txt
rename to src/test/resources/org/apache/commons/csv/CSVFileParser/testCSV85_default.txt
diff --git a/src/test/resources/CSVFileParser/testCSV85_ignoreEmpty.txt b/src/test/resources/org/apache/commons/csv/CSVFileParser/testCSV85_ignoreEmpty.txt
similarity index 100%
rename from src/test/resources/CSVFileParser/testCSV85_ignoreEmpty.txt
rename to src/test/resources/org/apache/commons/csv/CSVFileParser/testCSV85_ignoreEmpty.txt
diff --git a/src/test/resources/CSVFileParser/test_default.txt b/src/test/resources/org/apache/commons/csv/CSVFileParser/test_default.txt
similarity index 100%
rename from src/test/resources/CSVFileParser/test_default.txt
rename to src/test/resources/org/apache/commons/csv/CSVFileParser/test_default.txt
diff --git a/src/test/resources/CSVFileParser/test_default_comment.txt b/src/test/resources/org/apache/commons/csv/CSVFileParser/test_default_comment.txt
similarity index 100%
rename from src/test/resources/CSVFileParser/test_default_comment.txt
rename to src/test/resources/org/apache/commons/csv/CSVFileParser/test_default_comment.txt
diff --git a/src/test/resources/CSVFileParser/test_rfc4180.txt b/src/test/resources/org/apache/commons/csv/CSVFileParser/test_rfc4180.txt
similarity index 100%
rename from src/test/resources/CSVFileParser/test_rfc4180.txt
rename to src/test/resources/org/apache/commons/csv/CSVFileParser/test_rfc4180.txt
diff --git a/src/test/resources/CSVFileParser/test_rfc4180_trim.txt b/src/test/resources/org/apache/commons/csv/CSVFileParser/test_rfc4180_trim.txt
similarity index 100%
rename from src/test/resources/CSVFileParser/test_rfc4180_trim.txt
rename to src/test/resources/org/apache/commons/csv/CSVFileParser/test_rfc4180_trim.txt
diff --git a/src/test/resources/csv-167/sample1.csv b/src/test/resources/org/apache/commons/csv/csv-167/sample1.csv
similarity index 100%
rename from src/test/resources/csv-167/sample1.csv
rename to src/test/resources/org/apache/commons/csv/csv-167/sample1.csv
diff --git a/src/test/resources/org/apache/commons/csv/empty.txt b/src/test/resources/org/apache/commons/csv/empty.txt
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/src/test/resources/perf/worldcitiespop.txt.gz b/src/test/resources/org/apache/commons/csv/perf/worldcitiespop.txt.gz
similarity index 100%
rename from src/test/resources/perf/worldcitiespop.txt.gz
rename to src/test/resources/org/apache/commons/csv/perf/worldcitiespop.txt.gz

Using predefined formats

Defining formats

Defining column names

Parsing

Referencing columns safely

Notes

Serialization

Notes

Creating instances

Parsing into memory

Notes

Apache Commons CSV

Introducing Commons CSV

Parsing Standard CSV Files

Parsing an Excel CSV File

Parsing Custom CSV Files

Handling Byte Order Marks
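
The body of this section is not part of this excerpt, so the following is only a rough sketch of the underlying idea, not the Commons CSV API. Some tools (Excel among them) write a UTF-8 byte order mark at the start of a CSV file, and if it is not removed it ends up glued to the first header name. Libraries such as Apache Commons IO provide a `BOMInputStream` for this. The JDK-only version below is a swapped-in alternative: the class `BomSkipper` and its methods are hypothetical names introduced for illustration, and it handles the UTF-8 BOM only.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of Commons CSV or Commons IO. */
public final class BomSkipper {

    /** A UTF-8 byte order mark is the three bytes EF BB BF. */
    private static final int BOM_LENGTH = 3;

    private BomSkipper() {
    }

    /** Wraps the stream in a UTF-8 reader, dropping a leading UTF-8 BOM if one is present. */
    public static Reader utf8ReaderSkippingBom(final InputStream in) throws IOException {
        final PushbackInputStream pushback = new PushbackInputStream(in, BOM_LENGTH);
        final byte[] head = new byte[BOM_LENGTH];
        // Read up to three bytes; a single read() may return fewer, so loop until EOF or full.
        int n = 0;
        while (n < head.length) {
            final int r = pushback.read(head, n, head.length - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        final boolean hasBom = n == BOM_LENGTH
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!hasBom && n > 0) {
            // Not a BOM: push the bytes back so the reader sees the stream unchanged.
            pushback.unread(head, 0, n);
        }
        return new InputStreamReader(pushback, StandardCharsets.UTF_8);
    }

    /** Convenience for small inputs: decodes the bytes as UTF-8 text without any leading BOM. */
    public static String decode(final byte[] bytes) {
        try (Reader reader = utf8ReaderSkippingBom(new ByteArrayInputStream(bytes))) {
            final StringBuilder sb = new StringBuilder();
            final char[] buffer = new char[256];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, read);
            }
            return sb.toString();
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(final String[] args) {
        final byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'i', 'd', ',', 'v'};
        System.out.println(decode(withBom));
    }
}
```

The reader returned by `utf8ReaderSkippingBom` can then be handed to whatever CSV parser is in use, so the first column name is read without the stray BOM character.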

Using Headers

Accessing column values by index

Defining a header manually

Using an enum to define a header

Header auto detection

Printing with headers

Working with JDBC

Exporting JDBC Result Sets

Limiting rows from JDBC Result Sets

Apache Commons CSV User Guide