Commit b3b78bb
WarcCdxWriter: extraction of redirect targets for CDX should not be case-sensitive (#18)
- make extraction of HTTP headers not depend on correct casing for:
  - "Location" and "Content-Type" (WarcCdxWriter: "redirect" and "mime")
  - "Content-Type" (support for language detector)
- refactor: header names as constants

1 parent 5b73e16 commit b3b78bb
4 files changed
Lines changed: 34 additions & 20 deletions
File tree
- src/java/org/commoncrawl/util
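
The idea in the commit message, looking up HTTP headers without depending on their casing and referring to header names through constants, can be illustrated with a minimal sketch. This is not the code from the commit: the class `HeaderLookup`, the method `getHeader`, and the constant names are hypothetical, and the sketch assumes the headers are available as a raw text block with one `Name: value` pair per line.

```java
// Illustrative only: names below are assumptions, not taken from the commit.
class HeaderLookup {

    // Header names kept as constants so callers never retype the literal.
    static final String LOCATION = "Location";
    static final String CONTENT_TYPE = "Content-Type";

    /**
     * Returns the trimmed value of the first header whose name matches
     * {@code name} ignoring case, or null if no such header is present.
     */
    static String getHeader(String rawHeaders, String name) {
        for (String line : rawHeaders.split("\r?\n")) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // status line, blank line, or malformed header
            }
            String candidate = line.substring(0, colon).trim();
            if (candidate.equalsIgnoreCase(name)) {
                return line.substring(colon + 1).trim();
            }
        }
        return null;
    }
}
```

With a lookup like this, a response that sends `location:` or `CONTENT-TYPE:` yields the same redirect target and MIME type as one that uses the canonical spelling, which is what the CDX "redirect" and "mime" fields need.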