Pull requests: commoncrawl/nutch

Pull requests list

Integrate Apache Nutch upstream improvements
#55 by sebastian-nagel was merged May 18, 2026
Add end to end tests for the SitemapInjector
#52 by lfoppiano was merged May 22, 2026
Enable report for running tests
#45 by lfoppiano was closed Feb 25, 2026
Fix revisit content-type
#42 by lfoppiano was merged Mar 28, 2026
Detect canonical links in Fetcher
#36 by sebastian-nagel was merged Dec 19, 2025
Add Github workflow to build the branch 'cc'
#31 by sebastian-nagel was merged Nov 21, 2024
WARC writer support HTTP/2
#30 by sebastian-nagel was merged Jul 27, 2024
Generator2: improvements and fixes
#28 by sebastian-nagel was merged May 6, 2024