Adapt jobs for multi-valued headers in WAT data#46

Merged
sebastian-nagel merged 1 commit into main from wat-multivalue-headers on Dec 17, 2024

Conversation

@sebastian-nagel

Contributor

Header data from WARC and HTTP headers will become multi-valued in WAT files. That is, the value is either a string or a list of strings. Jobs reading WAT files need to be adapted to the new data format.

See commoncrawl/ia-web-commons#18 and commoncrawl/ia-web-commons#38 for further details about multi-valued headers in WAT files.
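A job that reads a header from a WAT record can treat both shapes uniformly by normalizing the value to a list before using it. The following is a minimal sketch of that idea, not the code merged in this PR: the helper names `header_values` and `first_header_value` and the sample header dictionaries are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List, Optional


def header_values(headers: Dict[str, Any], name: str) -> List[str]:
    """Return all values of a header as a list (hypothetical helper).

    Handles both shapes: a plain string (single-valued) and a list
    of strings (multi-valued). A missing header gives an empty list.
    """
    value = headers.get(name)
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def first_header_value(headers: Dict[str, Any], name: str,
                       default: Optional[str] = None) -> Optional[str]:
    """Return the first value of a header, or `default` if it is absent."""
    values = header_values(headers, name)
    return values[0] if values else default


# Illustrative header dictionaries (made up for this sketch):
single_valued = {"Content-Type": "text/html"}
multi_valued = {"Content-Type": ["text/html", "text/plain"]}

print(header_values(single_valued, "Content-Type"))
print(header_values(multi_valued, "Content-Type"))
print(first_header_value(multi_valued, "Server", default="(unknown)"))
```

Normalizing at the point of access keeps the rest of a job unchanged: code that previously expected a single string can call `first_header_value`, while code that needs every value can iterate over `header_values`. The lookup here matches header names exactly as they are stored.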

@sebastian-nagel merged commit 16e2064 into main on Dec 17, 2024
@sebastian-nagel deleted the wat-multivalue-headers branch on December 20, 2024, 10:10