I'm dealing with a WARC archive that was converted from ARC files. When I ran the Archives Unleashed Toolkit on it, I ran into issues with unparseable dates. The problem is described in archivesunleashed/aut#163.
I have fixed the dates, changing them from their YYYYmmddHHMM format to proper ISO-8601. But now I'm getting "unexpected extra data after record org.archive.io.warc.WARCRecord" errors for all those files.
Any suggestions on what else needs to be fixed?
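
For context, the date fix was roughly of the following kind. This is only a minimal sketch, not the exact script that was run: the function name `to_warc_date` is made up for illustration, and it assumes the timestamps are UTC and either 12 digits (YYYYmmddHHMM, as above) or 14 digits with seconds.

```python
from datetime import datetime, timezone


def to_warc_date(stamp: str) -> str:
    """Convert a compact ARC-style timestamp to the ISO-8601 form used by WARC-Date.

    Hypothetical helper for illustration; assumes the input is in UTC.
    """
    # Assumption: 12 digits mean YYYYmmddHHMM, 14 digits add seconds.
    formats = {12: "%Y%m%d%H%M", 14: "%Y%m%d%H%M%S"}
    fmt = formats.get(len(stamp))
    if fmt is None or not stamp.isdigit():
        raise ValueError(f"unrecognized timestamp: {stamp!r}")
    parsed = datetime.strptime(stamp, fmt).replace(tzinfo=timezone.utc)
    # WARC-Date is written as YYYY-MM-DDThh:mm:ssZ.
    return parsed.strftime("%Y-%m-%dT%H:%M:%SZ")
```

Calling `to_warc_date("201703151230")` on a 12-digit value fills in zero seconds, so the result is always a full timestamp ending in `Z`.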