Commit 4c85457

Documentation improvements
1 parent e821aa2 commit 4c85457

1 file changed

Lines changed: 82 additions & 49 deletions

README.md

@@ -1,21 +1,33 @@
11
# orjson
22

3-
orjson is a fast, correct JSON library for Python. It benchmarks as the
4-
fastest Python library for JSON and is more correct than the standard json
5-
library or third-party libraries.
6-
7-
Its serialization performance is 2.5x to 9.5x the nearest
8-
other library and 4x to 12x the standard library. Its deserialization
9-
performance is 1.2x to 1.3x the nearest other library and 1.4x to 2x
10-
the standard library.
11-
12-
It differs in behavior from other Python JSON libraries in supporting
13-
datetimes, not supporting subclasses without a `default` hook,
14-
serializing UTF-8 to bytes rather than escaped ASCII (e.g., "好" rather than
15-
"\\\u597d") by default, having strict UTF-8 conformance, having strict JSON
16-
conformance on NaN/Infinity/-Infinity, having an option for strict
17-
JSON conformance on 53-bit integers, not supporting pretty
18-
printing, and not supporting all standard library options.
3+
orjson is a fast, correct JSON library for Python. It
4+
[benchmarks](#performance) as the fastest Python library for JSON and is
5+
more correct than the standard json library or third-party libraries.
6+
7+
Its serialization performance on fixtures of real data is 2.5x to 9.5x the
8+
nearest other library and 4x to 12x the standard library. Its deserialization
9+
performance on the same fixtures is 1.2x to 1.3x the nearest other
10+
library and 1.4x to 2x the standard library.
11+
12+
Its features and drawbacks compared to other Python JSON libraries:
13+
14+
* serializes `datetime`, `date`, and `time` instances to RFC 3339 format,
15+
a subset of ISO 8601
16+
* serializes to `bytes` rather than `str`
17+
* serializes `str` without escaping unicode to ASCII, e.g., "好" rather than
18+
"\\\u597d"
19+
* serializes `float` 10x faster and deserializes twice as fast as other
20+
libraries
21+
* serializes arbitrary types using a `default` hook
22+
* does not support subclasses, requiring use of `default`
23+
* has strict UTF-8 conformance, more correct than the standard library
24+
* has strict JSON conformance in not supporting NaN/Infinity/-Infinity
25+
* has an option for strict JSON conformance on 53-bit integers, with default
26+
support for 64-bit integers
27+
* does not support pretty printing
28+
* does not support sorting `dict` by keys
29+
* does not provide `load()` or `dump()` functions for reading from or
30+
writing to file-like objects
1931

2032
orjson supports CPython 3.5, 3.6, 3.7, and 3.8. It distributes wheels for Linux,
2133
macOS, and Windows. The manylinux1 wheel differs from PEP 513 in requiring
@@ -26,6 +38,21 @@ repository and issue tracker is
2638
[github.com/ijl/orjson](https://github.com/ijl/orjson), and patches may be
2739
submitted there.
2840

41+
1. [Usage](#usage)
42+
1. [Install](#install)
43+
2. [Serialize](#serialize)
44+
3. [Deserialize](#deserialize)
45+
2. [Types](#types)
46+
1. [datetime](#datetime)
47+
2. [int](#int)
48+
3. [float](#float)
49+
4. [str](#str)
50+
3. [Testing](#testing)
51+
4. [Performance](#performance)
52+
1. [Latency](#latency)
53+
2. [Memory](#memory)
54+
3. [Reproducing](#reproducing)
55+
2956
## Usage
3057

3158
### Install
@@ -61,9 +88,33 @@ It natively serializes
6188
`typing.TypedDict`, `datetime.datetime`,
6289
`datetime.date`, `datetime.time`, and `None` instances. It supports
6390
arbitrary types through `default`. It does not serialize subclasses of
64-
supported types natively, but `default` may be used.
91+
supported types natively.
6592

66-
It accepts options via an `option` keyword argument. These include:
93+
To serialize a subclass or arbitrary types, specify `default` as a
94+
callable that returns a supported type. `default` may be a function,
95+
lambda, or callable class instance.
96+
97+
```python
98+
>>> import orjson, numpy
99+
>>>
100+
def default(obj):
101+
    if isinstance(obj, numpy.ndarray):
102+
        return obj.tolist()
103+
>>> orjson.dumps(numpy.random.rand(2, 2), default=default)
104+
b'[[0.08423896597867486,0.854121264944197],[0.8452845446981371,0.19227780743524303]]'
105+
```
106+
107+
If the `default` callable does not return an object, and an exception
108+
was raised within the `default` function, an exception describing this is
109+
raised. If no object is returned by the `default` callable but also
110+
no exception was raised, it falls through to raising `JSONEncodeError` on an
111+
unsupported type.
112+
113+
The `default` callable may return an object that itself
114+
must be handled by `default` up to five levels deep before an exception
115+
is raised.
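
The nesting behavior described above can be sketched with the standard library's `json.dumps()`, whose `default` hook has the same shape (a callable that returns a supported type). This is only an illustration under that assumption: `Price` is a hypothetical application type, and the five-level limit is specific to orjson and is not demonstrated here.

```python
import json
from decimal import Decimal


class Price:
    """Hypothetical application type, used only for illustration."""

    def __init__(self, amount, currency):
        self.amount = amount      # a Decimal, itself not natively serializable
        self.currency = currency


def default(obj):
    # First call: Price becomes a dict that still contains a Decimal.
    if isinstance(obj, Price):
        return {"amount": obj.amount, "currency": obj.currency}
    # Second call: the serializer hands the nested Decimal back to default.
    if isinstance(obj, Decimal):
        return str(obj)
    # Signal unsupported types explicitly rather than returning nothing.
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


encoded = json.dumps(Price(Decimal("9.99"), "EUR"), default=default)
print(encoded)
```

Here `default` is invoked twice for one top-level object: once for the `Price` and once for the `Decimal` it returned. Raising `TypeError` for unknown types keeps the failure explicit.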
116+
117+
`dumps()` accepts options via an `option` keyword argument. These include:
67118

68119
- `orjson.OPT_STRICT_INTEGER` for enforcing a 53-bit limit on integers. The
69120
limit is otherwise 64 bits, the same as the Python standard library.
@@ -93,28 +144,6 @@ It raises `JSONEncodeError` if a `tzinfo` on a datetime object is incorrect.
93144
`JSONEncodeError` is a subclass of `TypeError`. This is for compatibility
94145
with the standard library.
95146

96-
To serialize arbitrary types, specify `default` as a callable that returns
97-
a supported type. `default` may be a function, lambda, or callable class
98-
instance.
99-
100-
```python
101-
>>> import orjson, numpy
102-
>>> def default(obj):
103-
    if isinstance(obj, numpy.ndarray):
104-
        return obj.tolist()
105-
>>> orjson.dumps(numpy.random.rand(2, 2), default=default)
106-
b'[[0.08423896597867486,0.854121264944197],[0.8452845446981371,0.19227780743524303]]'
107-
```
108-
109-
If the `default` callable does not return an object, and an exception
110-
was raised within the `default` function, an exception describing this is
111-
raised. If no object is returned by the `default` callable but also
112-
no exception was raised, it falls through to raising `JSONEncodeError` on an
113-
unsupported type.
114-
115-
The `default` callable may return an object that itself
116-
must be handled by `default` up to five levels deep before an exception
117-
is raised.
118147

119148
### Deserialize
120149

@@ -140,6 +169,8 @@ which the standard library allows, but is not valid JSON.
140169
`JSONDecodeError` is a subclass of `json.JSONDecodeError` and `ValueError`.
141170
This is for compatibility with the standard library.
142171

172+
## Types
173+
143174
### datetime
144175

145176
orjson serializes `datetime.datetime` objects to
@@ -241,7 +272,7 @@ i.e., it modifies the data.
241272
compliant JSON, as `null`:
242273

243274
```python
244-
>>> import orjson
275+
>>> import orjson, ujson, rapidjson, json
245276
>>> orjson.dumps([float("NaN"), float("Infinity"), float("-Infinity")])
246277
b'[null,null,null]'
247278
>>> ujson.dumps([float("NaN"), float("Infinity"), float("-Infinity")])
@@ -252,12 +283,18 @@ OverflowError: Invalid Inf value when encoding double
252283
'[NaN, Infinity, -Infinity]'
253284
```
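
For comparison, the standard library's behavior can be reproduced without any third-party package. The sketch below uses only `json` and shows both its default output, which is not valid JSON, and its `allow_nan=False` mode, which rejects these values instead of writing `null` as orjson does.

```python
import json

values = [float("nan"), float("inf"), float("-inf")]

# Default behavior: emits NaN/Infinity tokens, which are not valid JSON.
lenient = json.dumps(values)
print(lenient)

# allow_nan=False makes the standard library raise instead of emitting them.
try:
    json.dumps(values, allow_nan=False)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```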
254285

255-
### UTF-8
286+
### str
256287

257-
orjson raises an exception on invalid UTF-8. This is
258-
necessary because Python 3 `str` objects may contain UTF-16 surrogates.
288+
orjson is strict about UTF-8 conformance. This is stricter than the standard
289+
library's json module, which will serialize and deserialize UTF-16 surrogates,
290+
e.g., "\ud800", that are invalid UTF-8.
259291

260-
The standard library's json module deserializes and serializes invalid UTF-8.
292+
If `orjson.dumps()` is given a `str` that does not contain valid UTF-8,
293+
`orjson.JSONEncodeError` is raised. If `loads()` receives invalid UTF-8,
294+
`orjson.JSONDecodeError` is raised.
295+
296+
orjson and rapidjson are the only compared JSON libraries to consistently
297+
error on bad input.
261298

262299
```python
263300
>>> import orjson, ujson, rapidjson, json
@@ -269,10 +306,6 @@ UnicodeEncodeError: 'utf-8' codec ...
269306
UnicodeEncodeError: 'utf-8' codec ...
270307
>>> json.dumps('\ud800')
271308
'"\\ud800"'
272-
```
273-
274-
```python
275-
>>> import orjson, ujson, rapidjson, json
276309
>>> orjson.loads('"\\ud800"')
277310
JSONDecodeError: unexpected end of hex escape at line 1 column 8: line 1 column 1 (char 0)
278311
>>> ujson.loads('"\\ud800"')
