How to list an array’s fields/columns/keys
Arrays of records
As seen in How to create arrays of records, one of Awkward Array’s most useful features is the ability to compose separate arrays into a single record structure:
import awkward as ak
import numpy as np
records = ak.Array(
    [
        {"x": 0.014309631995020777, "y": 0.7077380205549498},
        {"x": 0.44925764718311145, "y": 0.11927022136408238},
        {"x": 0.9870653236436898, "y": 0.1543661194285082},
        {"x": 0.7071893130949595, "y": 0.3966721033002645},
        {"x": 0.3059032831996634, "y": 0.5094743992919755},
    ]
)
[{x: 0.0143, y: 0.708},
{x: 0.449, y: 0.119},
{x: 0.987, y: 0.154},
{x: 0.707, y: 0.397},
{x: 0.306, y: 0.509}]
-----------------------
backend: cpu
nbytes: 80 B
type: 5 * {
x: float64,
y: float64
}
The type of an array gives an indication of the fields that it contains. We can see that the records array contains two fields, "x" and "y":
print(records.type)
5 * {x: float64, y: float64}
records.type.show()
5 * {
x: float64,
y: float64
}
The ak.Array object itself provides a convenient ak.Array.fields property that returns the list of field names:
records.fields
['x', 'y']
In addition, Awkward Array provides a high-level ak.fields() function that returns the same result:
ak.fields(records)
['x', 'y']
Arrays of tuples
In addition to records, Awkward Array also has the concept of tuples.
tuples = ak.Array(
    [
        (1, 2, 3),
        (1, 2, 3),
    ]
)
[(1, 2, 3),
(1, 2, 3)]
-----------
backend: cpu
nbytes: 48 B
type: 2 * (
int64,
int64,
int64
)
These look very similar to records, but the fields are unnamed:
print(tuples.type)
2 * (int64, int64, int64)
Despite this, both the ak.fields() function and the ak.Array.fields property return non-empty lists of strings when used to query a tuple array:
ak.fields(tuples)
['0', '1', '2']
tuples.fields
['0', '1', '2']
The returned field names are string-quoted integers ("0", "1", …) that refer to zero-indexed tuple slots, and can be used to project the array:
tuples["0"]
[1, 1]
---
backend: cpu
nbytes: 16 B
type: 2 * int64
tuples["1"]
[2, 2]
---
backend: cpu
nbytes: 16 B
type: 2 * int64
The fields of records can also be accessed as attributes of the array:
records.x
[0.0143, 0.449, 0.987, 0.707, 0.306]
--------
backend: cpu
nbytes: 40 B
type: 5 * float64
The same is not true of tuples, because a name that starts with a digit is not a valid Python attribute name:
tuples.0
Cell In[14], line 1
tuples.0
^
SyntaxError: invalid syntax
The close similarity between records and tuples naturally raises the question:
How do I know whether an array contains records or tuples?
The ak.is_tuple() function can be used to differentiate between the two:
ak.is_tuple(tuples)
True
ak.is_tuple(records)
False