WarcRecordWriter performance improvements #8

Description

@sebastian-nagel

According to async-profiler, most of the CPU time when writing WARC files is spent on:

  • gzip compression
  • SHA digests
  • language detection
  • charset detection

[Figure: warc_writer_prof profile (interactive SVG attached as a zip)]
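To get a rough feel for the first two items in isolation, a small stand-alone timing sketch can help. This is not the WarcRecordWriter code: the class name `WarcCostSketch`, the helper methods, the choice of SHA-1, and the synthetic 1 MiB payload are all assumptions made for illustration, and it only uses JDK classes (`MessageDigest`, `GZIPOutputStream`). Timings from a single run like this are indicative at best; a profiler or a proper benchmark harness gives more reliable numbers.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch: measures digest and gzip cost on one synthetic record.
public class WarcCostSketch {

    /** Hex-encoded SHA-1 of the payload (SHA-1 is an assumption; the issue only says "SHA digests"). */
    public static String sha1Hex(byte[] payload) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        byte[] digest = md.digest(payload);
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    /** Compresses one record into its own gzip member, as is usual for compressed WARC files. */
    public static byte[] gzipMember(byte[] record) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(record);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Synthetic, mildly repetitive 1 MiB payload standing in for a fetched page.
        byte[] payload = new byte[1 << 20];
        for (int i = 0; i < payload.length; i++) {
            payload[i] = (byte) ('a' + (i * 31 % 23));
        }

        long t0 = System.nanoTime();
        String digest = sha1Hex(payload);
        long t1 = System.nanoTime();
        byte[] compressed = gzipMember(payload);
        long t2 = System.nanoTime();

        System.out.println("sha1: " + digest);
        System.out.println("digest time (us): " + (t1 - t0) / 1_000);
        System.out.println("gzip time (us):   " + (t2 - t1) / 1_000);
        System.out.println("compressed bytes: " + compressed.length);
    }
}
```

Running the digest and the compression as separate steps, as above, makes their relative cost visible on a single record without involving the rest of the writer.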

Tasks to improve the performance:
