WAT extractor: do not fail on missing WARC-Filename in warcinfo record#89

Merged
ldko merged 2 commits into iipc:master from sebastian-nagel:webarchive-commons-88 on Jun 15, 2020

@sebastian-nagel

Collaborator

fixes #88

  • do not throw IOException if there is no WARC-Filename in warcinfo record
  • write metadata record (corresponding to warcinfo) without WARC-Target-URI
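
The two changes above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual webarchive-commons code: the class `WatMetadataSketch`, the method `metadataHeaders`, and the use of a plain `Map` for record headers are all hypothetical, and it assumes the value of `WARC-Filename` would otherwise be used as the `WARC-Target-URI` of the metadata record.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch only; names and types are NOT taken from webarchive-commons.
final class WatMetadataSketch {

    /**
     * Builds the headers of the metadata record that corresponds to a
     * warcinfo record. Assumption: the WARC-Filename value, when present,
     * becomes the WARC-Target-URI of the metadata record.
     */
    static Map<String, String> metadataHeaders(Map<String, String> warcinfoHeaders) {
        Map<String, String> out = new LinkedHashMap<>();
        out.put("WARC-Type", "metadata");

        String filename = warcinfoHeaders.get("WARC-Filename");
        if (filename != null && !filename.isEmpty()) {
            out.put("WARC-Target-URI", filename);
        }
        // If WARC-Filename is missing, no exception is thrown: the metadata
        // record is simply written without a WARC-Target-URI header.
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> warcinfo = new LinkedHashMap<>();
        warcinfo.put("WARC-Type", "warcinfo");
        // No WARC-Filename set: the result has no WARC-Target-URI entry.
        System.out.println(metadataHeaders(warcinfo));
    }
}
```

The design choice illustrated here is to treat the missing header as a tolerated condition and omit the dependent field, rather than aborting extraction for the whole file.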

@ldko ldko left a comment

Member

Hi Sebastian, thanks for submitting this and all the details in the issue as well. The fix makes sense to me and resolved the issue when I tried it with the sample WARC you provided in #88. If you have a moment: I noticed that CHANGES.md does not reflect the last couple of fixes you made. Would you be willing to update the file to include this fix and #85 and #86 under a 1.1.10 heading? If not, I will take care of it. Thank you again!

@sebastian-nagel

Collaborator Author

@ldko - updated the change log. Thanks!

@ldko ldko left a comment

Member

Wonderful, thank you, @sebastian-nagel

@ldko ldko merged commit 21c5cc4 into iipc:master Jun 15, 2020
@sebastian-nagel sebastian-nagel deleted the webarchive-commons-88 branch June 15, 2020 14:43
sebastian-nagel added a commit to commoncrawl/ia-web-commons that referenced this pull request Jun 15, 2020
sebastian-nagel pushed a commit to commoncrawl/ia-web-commons that referenced this pull request Jun 15, 2020
fixes #23, closes #24

WAT extractor: do not fail on missing WARC-Filename in warcinfo record