How to convert to/from JSON

Any JSON data can be converted to an Awkward Array, and any Awkward Array can be converted to JSON. However, Awkward type information, such as the distinction between fixed-size and variable-length lists, is lost in the conversion to JSON.

import awkward as ak
import pathlib

From JSON to Awkward

The function for JSON → Awkward conversion is ak.from_json().

It can be given a JSON string:

ak.from_json("[[1.1, 2.2, 3.3], [], [4.4, 5.5]]")
[[1.1, 2.2, 3.3],
 [],
 [4.4, 5.5]]
-----------------------
backend: cpu
nbytes: 72 B
type: 3 * var * float64

or a file name, wrapped in pathlib.Path so that it is not mistaken for JSON text:

!echo "[[1.1, 2.2, 3.3], [], [4.4, 5.5]]" > /tmp/awkward-example-1.json
ak.from_json(pathlib.Path("/tmp/awkward-example-1.json"))
[[1.1, 2.2, 3.3],
 [],
 [4.4, 5.5]]
-----------------------
backend: cpu
nbytes: 72 B
type: 3 * var * float64

If the top-level JSON value is a single object, an ak.Record is returned rather than an ak.Array.

ak.from_json('{"x": 1, "y": [1, 2], "z": "hello"}')
<Record {x: 1, y: [1, 2], z: 'hello'} type='{x: int64, y: var * int64, z: s...'>

From Awkward to JSON

The function for Awkward → JSON conversion is ak.to_json().

With one argument, it returns a string.

ak.to_json(ak.Array([[1.1, 2.2, 3.3], [], [4.4, 5.5]]))
'[[1.1,2.2,3.3],[],[4.4,5.5]]'

If a destination is given as the second argument, it is taken to be the name of the output file.

ak.to_json(ak.Array([[1.1, 2.2, 3.3], [], [4.4, 5.5]]), "/tmp/awkward-example-2.json")
!cat /tmp/awkward-example-2.json
[[1.1,2.2,3.3],[],[4.4,5.5]]

Conversion of different types

All of the rules that ak.from_iter() and ak.to_list() apply to Python objects also apply to ak.from_json() and ak.to_json(), with JSON types taking the place of built-in Python types. (One exception: JSON has no equivalent of a Python tuple.)

Performance

Awkward Array uses RapidJSON internally to parse the JSON string and convert it in a single step, so ak.from_json() and ak.to_json() should always be faster and use less memory than going through ak.from_iter() and ak.to_list(). Don’t convert JSON strings into Python objects only to pass them to ak.from_iter(), or arrays into Python objects only to serialize them as JSON: use the JSON converters directly.