Looking at a Heritrix request with tcpdump, you can see that a separate TCP packet is sent for each of the characters 'G' 'E' 'T' ' ' '/' ' ' 'H' 'T' 'T'... at the beginning of each HTTP request. We ran into a website that responds with a 400 to these requests. Besides, it's inefficient and ugly to be sending all these little packets. The culprit is a shortcut in my own code in RecordingOutputStream.
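To illustrate the general effect, here is a minimal sketch. It is not the actual RecordingOutputStream code. It assumes the problem comes down to single-byte writes reaching the underlying stream unbuffered, so each one can go out as its own packet (depending on socket settings such as TCP_NODELAY). `WriteCountDemo`, `CountingOutputStream`, and the `HTTP/1.0` request line are hypothetical names and values made up for the example; the counting stream just stands in for a socket's output stream.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class WriteCountDemo {

    /** Hypothetical stand-in for a socket stream: counts the write calls it receives. */
    static class CountingOutputStream extends OutputStream {
        final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int writeCalls = 0;

        @Override
        public void write(int b) {
            writeCalls++;
            sink.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) {
            writeCalls++;
            sink.write(b, off, len);
        }
    }

    /** Passes every byte straight through: one downstream write per character. */
    static int writeByteByByte(byte[] data) throws IOException {
        CountingOutputStream out = new CountingOutputStream();
        for (byte b : data) {
            out.write(b);
        }
        out.flush();
        return out.writeCalls;
    }

    /** Same single-byte writes, but collected by a BufferedOutputStream first. */
    static int writeBuffered(byte[] data) throws IOException {
        CountingOutputStream out = new CountingOutputStream();
        OutputStream buffered = new BufferedOutputStream(out);
        for (byte b : data) {
            buffered.write(b);
        }
        buffered.flush(); // the buffered bytes reach the stream in one bulk write
        return out.writeCalls;
    }

    public static void main(String[] args) throws IOException {
        // Illustrative request only; the real request line depends on the crawl.
        byte[] request = "GET / HTTP/1.0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println("unbuffered write calls: " + writeByteByByte(request));
        System.out.println("buffered write calls:   " + writeBuffered(request));
    }
}
```

With the unbuffered path the stream sees as many writes as there are bytes in the request; with buffering the same bytes arrive in a single bulk write once the buffer is flushed.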