Commit a19c96e

Add failing test for unicode decoding
Exposes a decoding issue that occurs whenever a multi-byte character spans a chunk-size boundary (#30).
1 parent 7db6dc5 commit a19c96e

1 file changed, 10 additions and 0 deletions

test/byte_streams_test.clj

@@ -10,6 +10,7 @@
     [java.nio.charset
      Charset]
     [java.io
+     ByteArrayInputStream
      File]
     [java.nio
      ByteBuffer]
@@ -129,3 +130,12 @@
               to-input-stream
               (convert (seq-of ByteBuffer) {:chunk-size 1})
               to-byte-array)))))
+
+(deftest test-unicode-decoding
+  (let [three-byte-char ""
+        text (apply str (repeat 10000 three-byte-char))
+        text-bytes (.getBytes text "utf-8")]
+    (is (bytes= text-bytes
+          (.getBytes (convert (ByteArrayInputStream. text-bytes) String) "utf-8")))
+    (is (bytes= text-bytes
+          (.getBytes (convert (ByteArrayInputStream. text-bytes) String {:chunk-size 100}) "utf-8")))))
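The failure mode this test targets can be reproduced with plain Java interop, independent of the library. The sketch below is not the library's implementation or its eventual fix. The helper names `chunks`, `decode-naive` and `decode-streaming` are hypothetical, and the euro sign (U+20AC, three bytes in UTF-8) is an assumed stand-in for a three-byte character. It contrasts decoding each chunk on its own with decoding through a single `CharsetDecoder` that carries incomplete trailing bytes into the next chunk.

```clojure
(import '[java.nio ByteBuffer CharBuffer]
        '[java.nio.charset StandardCharsets])

;; Hypothetical helper: split a byte array into chunks of at most n bytes.
(defn chunks [^bytes bs n]
  (map byte-array (partition-all n bs)))

;; Naive approach: decode every chunk independently. A character whose
;; bytes straddle two chunks is malformed in both of them.
(defn decode-naive [byte-chunks]
  (apply str (map #(String. ^bytes % StandardCharsets/UTF_8) byte-chunks)))

;; Stateful approach: one decoder for the whole stream. Bytes the decoder
;; could not consume yet (an incomplete sequence at the end of a chunk)
;; are prepended to the next chunk.
(defn decode-streaming [byte-chunks]
  (let [decoder (.newDecoder StandardCharsets/UTF_8)]
    (loop [remaining (seq byte-chunks)
           leftover  (byte-array 0)
           acc       []]
      (if remaining
        (let [in   (ByteBuffer/wrap (byte-array (concat leftover (first remaining))))
              out  (CharBuffer/allocate (.remaining in))
              end? (nil? (next remaining))]
          ;; With endOfInput false, an incomplete trailing sequence is left
          ;; unread in `in` rather than reported as malformed.
          (.decode decoder in out (boolean end?))
          (.flip out)
          (let [tail (byte-array (.remaining in))]
            (.get in tail)
            (recur (next remaining) tail (conj acc (str out)))))
        (apply str acc)))))

;; Assumed sample data: five euro signs, 15 bytes, split into 2-byte chunks.
(def sample-text (apply str (repeat 5 "\u20AC")))
(def sample-bytes (.getBytes ^String sample-text StandardCharsets/UTF_8))
```

With a chunk size of 2, every chunk boundary falls inside a character, so `(decode-naive (chunks sample-bytes 2))` no longer equals `sample-text`, while `(decode-streaming (chunks sample-bytes 2))` round-trips it.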
