Skip to content

[CSV-239] Add CSVRecord.getHeaderNames and allow duplicate headers#41

Merged
garydgregory merged 6 commits into
apache:masterfrom
davidmoten:headers
May 24, 2019
Merged

[CSV-239] Add CSVRecord.getHeaderNames and allow duplicate headers#41
garydgregory merged 6 commits into
apache:masterfrom
davidmoten:headers

Conversation

@davidmoten

Copy link
Copy Markdown
Contributor

These are the changes:

  • add getHeaderNames returns all headers in column order including repeats which are allowed in general as per RFC 4180
  • add CSVFormat.withAllowDuplicateHeaderNames(). CSVFormat.DEFAULT now allows duplicate header names because RFC 4190 allows non-unique header names. This is a behavioural change but not a breaking API change anywhere because there is no API contract for it (e.g. javadoc). Because CSVFormat.DEFAULT should reflect RFC 4190 I'd classify this as a bug fix.
  • CSVFormat is Serializable which means adding new fields to it (allowDuplicateHeaderNames) is theoretically a breaking change. I propose we allow this minor breaking change and also propose that CSVFormat does not implement Serializable in 2.x
  • fix CSVRecord.toMap javadoc
  • fix bug in CSVParser where an IAE is thrown with a message about duplicate headers when the problem was actually a missing header name
  • add test coverage

Question:

  • do we need to talk about HeaderNames when we could just say Header?

Not addressed:

  • would be nice if CSVRecord.toMap returned a Map whose entries are iterable in column order but this involves quite a bit of rework so will leave for another PR (probably for 2.x).
  • CSVRecord.get(String) should ideally throw when two columns with that header name exist

Notes for 2.x:

  • for consistency CSVFormat.withAllowMissingColumnNames should be CSVFormat.withAllowMissingHeaderNames
  • remove Serializable from CSVFormat
  • CSVFormat.withIgnoreHeaderCase creates problems and lacks flexibility. I'd suggest CSVRecord.getIgnoreCase(int) instead

* getHeaderNames returns all headers in column order including repeats which are allowed as per RFC 4180
* add CSVFormat.withAllowDuplicateHeaderNames()
* only wrap headerNames with unmodifiableList if non-empty
* fix and enhance CSVRecord.toMap javadoc
@coveralls

coveralls commented May 22, 2019

Copy link
Copy Markdown

Coverage Status

Coverage increased (+0.05%) to 90.807% when pulling 2a5ddb7 on davidmoten:headers into 4d2616b on apache:master.

@garydgregory garydgregory left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @davidmoten

Thank you for your PR. See my comments/requests.

Gary

Comment thread src/main/java/org/apache/commons/csv/CSVFormat.java Outdated
Comment thread src/main/java/org/apache/commons/csv/CSVParser.java
Comment thread src/main/java/org/apache/commons/csv/CSVParser.java Outdated
Comment thread src/main/java/org/apache/commons/csv/CSVParser.java Outdated
Comment thread src/main/java/org/apache/commons/csv/CSVParser.java Outdated
Comment thread src/test/java/org/apache/commons/csv/CSVParserTest.java
* fix indentation
* add javadoc to Headers class
* rename method to createHeaders
* use String.format to build error message
* initialize header names List with appropriate size
@davidmoten

Copy link
Copy Markdown
Contributor Author

Thanks for the review, I've addressed all comments with the latest commit.

@davidmoten

Copy link
Copy Markdown
Contributor Author

BTW @garydgregory , would you like me to fix these mvn warnings while I'm here:

[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-compiler-plugin is missing. @ line 164, column 15
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-surefire-plugin is missing. @ line 182, column 15
[WARNING] 'build.plugins.plugin.version' for org.apache.rat:apache-rat-plugin is missing. @ line 204, column 15

I would just specify the latest versions explicitly for those plugins.

@garydgregory

Copy link
Copy Markdown
Member

@davidmoten I do not see these warnings. What command do you run to get these?

@davidmoten

Copy link
Copy Markdown
Contributor Author

mvn clean install. Warnings appear at start of any mvn call actually.

@garydgregory garydgregory merged commit 030fb8e into apache:master May 24, 2019
@garydgregory

Copy link
Copy Markdown
Member

Merged. I do not see your warnings. Here is my output:

C:\git\commons-csv>mvn -V clean install
Apache Maven 3.6.1 (d66c9c0b3152b2e69ee9bac180bb8fcc8e6af555; 2019-04-04T15:00:29-04:00)
Maven home: C:\Java\apache-maven-3.6.1\bin\..
Java version: 1.8.0_212, vendor: Oracle Corporation, runtime: C:\Program Files\Java\jdk1.8.0_212\jre
Default locale: en_US, platform encoding: Cp1252
OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"
[INFO] Scanning for projects...
[INFO]
[INFO] -------------------< org.apache.commons:commons-csv >-------------------
[INFO] Building Apache Commons CSV 1.7-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- maven-clean-plugin:3.1.0:clean (default-clean) @ commons-csv ---
[INFO] Deleting C:\git\commons-csv\target
[INFO]
[INFO] --- maven-enforcer-plugin:3.0.0-M2:enforce (enforce-maven-version) @ commons-csv ---
[INFO]
[INFO] --- maven-enforcer-plugin:3.0.0-M2:enforce (enforce-maven-3) @ commons-csv ---
[INFO]
[INFO] --- apache-rat-plugin:0.13:check (rat-check) @ commons-csv ---
[INFO] Enabled default license matchers.
[INFO] Will parse SCM ignores for exclusions...
[INFO] Parsing exclusions from C:\git\commons-csv\.gitignore
[INFO] Finished adding exclusions from SCM ignore files.
[INFO] 70 implicit excludes (use -debug for more details).
[INFO] 23 explicit excludes (use -debug for more details).
[INFO] 55 resources included (use -debug for more details)
[INFO] Rat check: Summary over all files. Unapproved: 0, unknown: 0, generated: 0, approved: 48 licenses.
[INFO]
[INFO] --- build-helper-maven-plugin:3.0.0:parse-version (parse-version) @ commons-csv ---
[INFO]
[INFO] --- maven-antrun-plugin:1.8:run (javadoc.resources) @ commons-csv ---
[INFO] Executing tasks

main:
     [copy] Copying 2 files to C:\git\commons-csv\target\apidocs\META-INF
[INFO] Executed tasks
[INFO]
[INFO] --- maven-remote-resources-plugin:1.5:process (process-resource-bundles) @ commons-csv ---
[INFO]
[INFO] --- buildnumber-maven-plugin:1.4:create (default) @ commons-csv ---
[INFO] Executing: cmd.exe /X /C "git rev-parse --verify HEAD"
[INFO] Working directory: C:\git\commons-csv
[INFO] Storing buildNumber: 030fb8e37c4024b24fac2b5404300449a6741699 at timestamp: 1558700454294
[INFO] Storing buildScmBranch: master
[INFO]
[INFO] --- maven-resources-plugin:3.1.0:resources (default-resources) @ commons-csv ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory C:\git\commons-csv\src\main\resources
[INFO] Copying 2 resources to META-INF
[INFO]
[INFO] --- maven-compiler-plugin:3.8.0:compile (default-compile) @ commons-csv ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 12 source files to C:\git\commons-csv\target\classes
[INFO]
[INFO] --- maven-bundle-plugin:4.1.0:manifest (bundle-manifest) @ commons-csv ---
[INFO]
[INFO] --- animal-sniffer-maven-plugin:1.17:check (checkAPIcompatibility) @ commons-csv ---
[INFO] Checking unresolved references to org.codehaus.mojo.signature:java18:1.0
[INFO]
[INFO] --- maven-resources-plugin:3.1.0:testResources (default-testResources) @ commons-csv ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 14 resources
[INFO] Copying 2 resources to META-INF
[INFO]
[INFO] --- maven-compiler-plugin:3.8.0:testCompile (default-testCompile) @ commons-csv ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 19 source files to C:\git\commons-csv\target\test-classes
[INFO]
[INFO] --- jacoco-maven-plugin:0.8.3:prepare-agent (prepare-agent) @ commons-csv ---
[INFO] argLine set to -javaagent:C:\\Users\\ggregory\\.m2\\repository\\org\\jacoco\\org.jacoco.agent\\0.8.3\\org.jacoco.agent-0.8.3-runtime.jar=destfile=C:\\git\\commons-csv\\target\\jacoco.exec
[INFO]
[INFO] --- maven-surefire-plugin:2.22.1:test (default-test) @ commons-csv ---
[INFO]
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.commons.csv.AssertionsTest
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.066 s - in org.apache.commons.csv.AssertionsTest
[INFO] Running org.apache.commons.csv.CSVFileParserTest
[INFO] Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.041 s - in org.apache.commons.csv.CSVFileParserTest
[INFO] Running org.apache.commons.csv.CSVFormatPredefinedTest
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 s - in org.apache.commons.csv.CSVFormatPredefinedTest
[INFO] Running org.apache.commons.csv.CSVFormatTest
[INFO] Tests run: 56, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.074 s - in org.apache.commons.csv.CSVFormatTest
[INFO] Running org.apache.commons.csv.CSVParserTest
[WARNING] Tests run: 81, Failures: 0, Errors: 0, Skipped: 4, Time elapsed: 0.051 s - in org.apache.commons.csv.CSVParserTest
[INFO] Running org.apache.commons.csv.CSVPrinterTest
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
[WARNING] Tests run: 99, Failures: 0, Errors: 0, Skipped: 10, Time elapsed: 3.38 s - in org.apache.commons.csv.CSVPrinterTest
[INFO] Running org.apache.commons.csv.CSVRecordTest
[INFO] Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 s - in org.apache.commons.csv.CSVRecordTest
[INFO] Running org.apache.commons.csv.ExtendedBufferedReaderTest
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.005 s - in org.apache.commons.csv.ExtendedBufferedReaderTest
[INFO] Running org.apache.commons.csv.issues.JiraCsv164Test
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.013 s - in org.apache.commons.csv.issues.JiraCsv164Test
[INFO] Running org.apache.commons.csv.issues.JiraCsv167Test
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.01 s - in org.apache.commons.csv.issues.JiraCsv167Test
[INFO] Running org.apache.commons.csv.issues.JiraCsv198Test
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.212 s - in org.apache.commons.csv.issues.JiraCsv198Test
[INFO] Running org.apache.commons.csv.issues.JiraCsv203Test
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 s - in org.apache.commons.csv.issues.JiraCsv203Test
[INFO] Running org.apache.commons.csv.issues.JiraCsv213Test
[WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0.005 s - in org.apache.commons.csv.issues.JiraCsv213Test
[INFO] Running org.apache.commons.csv.LexerTest
[INFO] Tests run: 26, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.008 s - in org.apache.commons.csv.LexerTest
[INFO] Running org.apache.commons.csv.TokenMatchersTest
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.001 s - in org.apache.commons.csv.TokenMatchersTest
[INFO]
[INFO] Results:
[INFO]
[WARNING] Tests run: 325, Failures: 0, Errors: 0, Skipped: 15
[INFO]
[INFO]
[INFO] --- maven-jar-plugin:3.1.1:jar (default-jar) @ commons-csv ---
[INFO] Building jar: C:\git\commons-csv\target\commons-csv-1.7-SNAPSHOT.jar
[INFO]
[INFO] --- maven-site-plugin:3.7.1:attach-descriptor (attach-descriptor) @ commons-csv ---
[INFO] Skipping because packaging 'jar' is not pom.
[INFO]
[INFO] --- maven-jar-plugin:3.1.1:test-jar (default) @ commons-csv ---
[INFO] Building jar: C:\git\commons-csv\target\commons-csv-1.7-SNAPSHOT-tests.jar
[INFO]
[INFO] --- maven-source-plugin:3.0.1:jar-no-fork (create-source-jar) @ commons-csv ---
[INFO] Building jar: C:\git\commons-csv\target\commons-csv-1.7-SNAPSHOT-sources.jar
[INFO]
[INFO] --- maven-source-plugin:3.0.1:test-jar-no-fork (create-source-jar) @ commons-csv ---
[INFO] Building jar: C:\git\commons-csv\target\commons-csv-1.7-SNAPSHOT-test-sources.jar
[INFO]
[INFO] --- jacoco-maven-plugin:0.8.3:check (check) @ commons-csv ---
[INFO] Loading execution data file C:\git\commons-csv\target\jacoco.exec
[INFO] Analyzed bundle 'commons-csv' with 15 classes
[WARNING] Rule violated for bundle commons-csv: classes covered ratio is 0.93, but expected minimum is 1.00
[WARNING] Rule violated for bundle commons-csv: instructions covered ratio is 0.89, but expected minimum is 0.90
[WARNING] Rule violated for bundle commons-csv: methods covered ratio is 0.94, but expected minimum is 0.95
[WARNING] Rule violated for bundle commons-csv: complexity covered ratio is 0.84, but expected minimum is 0.85
[WARNING] Coverage checks have not been met. See log for details.
[INFO]
[INFO] --- maven-install-plugin:2.5.2:install (default-install) @ commons-csv ---
[INFO] Installing C:\git\commons-csv\target\commons-csv-1.7-SNAPSHOT.jar to C:\Users\ggregory\.m2\repository\org\apache\commons\commons-csv\1.7-SNAPSHOT\commons-csv-1.7-SNAPSHOT.jar
[INFO] Installing C:\git\commons-csv\pom.xml to C:\Users\ggregory\.m2\repository\org\apache\commons\commons-csv\1.7-SNAPSHOT\commons-csv-1.7-SNAPSHOT.pom
[INFO] Installing C:\git\commons-csv\target\commons-csv-1.7-SNAPSHOT-tests.jar to C:\Users\ggregory\.m2\repository\org\apache\commons\commons-csv\1.7-SNAPSHOT\commons-csv-1.7-SNAPSHOT-tests.jar
[INFO] Installing C:\git\commons-csv\target\commons-csv-1.7-SNAPSHOT-sources.jar to C:\Users\ggregory\.m2\repository\org\apache\commons\commons-csv\1.7-SNAPSHOT\commons-csv-1.7-SNAPSHOT-sources.jar
[INFO] Installing C:\git\commons-csv\target\commons-csv-1.7-SNAPSHOT-test-sources.jar to C:\Users\ggregory\.m2\repository\org\apache\commons\commons-csv\1.7-SNAPSHOT\commons-csv-1.7-SNAPSHOT-test-sources.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  19.972 s
[INFO] Finished at: 2019-05-24T08:21:10-04:00
[INFO] ------------------------------------------------------------------------

@davidmoten

Copy link
Copy Markdown
Contributor Author

Yeah no problem with the warnings. Not sure how I provoked them! Thanks for the merge. The next obvious question is when will you build a release?

@garydgregory

garydgregory commented May 25, 2019 via email

Copy link
Copy Markdown
Member

@garydgregory

Copy link
Copy Markdown
Member

Would you mind providing another PR that adds Javadocs to the public methods you added in CSVFormat?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants