How to list an array’s fields/columns/keys
%config InteractiveShell.ast_node_interactivity = "last_expr_or_assign"
Arrays of records
As seen in How to create arrays of records, one of Awkward Array’s most useful features is the ability to compose separate arrays into a single record structure:
import awkward as ak
import numpy as np
records = ak.Array(
[
{"x": 0.014309631995020777, "y": 0.7077380205549498},
{"x": 0.44925764718311145, "y": 0.11927022136408238},
{"x": 0.9870653236436898, "y": 0.1543661194285082},
{"x": 0.7071893130949595, "y": 0.3966721033002645},
{"x": 0.3059032831996634, "y": 0.5094743992919755},
]
)
[{x: 0.0143, y: 0.708}, {x: 0.449, y: 0.119}, {x: 0.987, y: 0.154}, {x: 0.707, y: 0.397}, {x: 0.306, y: 0.509}]
--------------------------------------------
backend: cpu
nbytes: 80 B
type: 5 * { x: float64, y: float64 }
The type of an array gives an indication of the fields that it contains. We can see that the records array contains two fields, "x" and "y":
print(records.type)
5 * {x: float64, y: float64}
records.type.show()
5 * {
x: float64,
y: float64
}
The ak.Array object itself provides a convenient ak.Array.fields property that returns the list of field names:
records.fields
['x', 'y']
In addition to this, Awkward Array also provides a high-level ak.fields() function that returns the same result:
ak.fields(records)
['x', 'y']
Arrays of tuples
In addition to records, Awkward Array also has the concept of tuples.
tuples = ak.Array(
[
(1, 2, 3),
(1, 2, 3),
]
)
[(1, 2, 3), (1, 2, 3)]
---------------------------------------------
backend: cpu
nbytes: 48 B
type: 2 * ( int64, int64, int64 )
These look very similar to records, but the fields are unnamed:
print(tuples.type)
2 * (int64, int64, int64)
Despite this, the ak.fields() function and the ak.Array.fields property both return non-empty lists of strings when used to query a tuple array:
ak.fields(tuples)
['0', '1', '2']
tuples.fields
['0', '1', '2']
The returned field names are string-quoted integers ("0", "1", …) that refer to zero-indexed tuple slots, and can be used to project the array:
tuples["0"]
[1, 1]
---------------
backend: cpu
nbytes: 16 B
type: 2 * int64
tuples["1"]
[2, 2]
---------------
backend: cpu
nbytes: 16 B
type: 2 * int64
The fields of records can also be accessed as attributes of the array:
records.x
[0.0143, 0.449, 0.987, 0.707, 0.306]
-----------------
backend: cpu
nbytes: 40 B
type: 5 * float64
The same is not true of tuples, because a field name such as "0" is not a valid Python identifier and so cannot follow a dot:
tuples.0
Cell In[14], line 1
tuples.0
^
SyntaxError: invalid syntax
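Whether a field name can be written after a dot is a matter of Python syntax, and can be checked with the standard library alone. The sketch below uses a hypothetical helper, `attribute_accessible`, which is not part of Awkward Array. It tests Python's identifier rules only, and says nothing about any further restrictions the library itself may apply:

```python
import keyword


def attribute_accessible(name: str) -> bool:
    """Return True if `name` is syntactically valid after a dot."""
    # A name must be an identifier and must not be a reserved keyword.
    return name.isidentifier() and not keyword.iskeyword(name)


# Field names taken from the records and tuples examples above.
for name in ["x", "y", "0", "1"]:
    print(name, attribute_accessible(name))
```

Names that fail this check, such as "0", can still be used with string indexing in the form tuples["0"].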
The close similarity between records and tuples naturally raises the question:
How do I know whether an array contains records or tuples?
The ak.is_tuple() function can be used to differentiate between the two:
ak.is_tuple(tuples)
True
ak.is_tuple(records)
False