How to restructure arrays by adding fields
import awkward as ak
import numpy as np
Adding fields to existing arrays
Using array['x']
"How to examine an array with simple slicing" describes the wide variety of slice types that can be used to pull values out of an Awkward Array. However, only slicing by field name is supported for assigning new values.
array = ak.Array({"x": [1, 2, 3]})
array.show()
[{x: 1},
{x: 2},
{x: 3}]
To assign a new value to an existing field of an array, we can simply use the subscript operator with the string name of the field. For example, to set the x field, we can write
array["x"] = [-1, -2, 3]
array.show()
[{x: -1},
{x: -2},
{x: 3}]
This might seem strange, given that we describe Awkward Arrays as immutable. A more detailed explanation is given in the Advanced Users call-out; in short, the fields of an array can be replaced, but individual values within an array cannot.
Advanced Users
An ak.Array doesn't itself contain any data; it wraps a low-level ak.contents.Content object that defines the structure of the array. Assigning to a field just replaces the existing ak.contents.Content with a new ak.contents.Content. Therefore, the ak.contents.Content objects are immutable, whilst ak.Array is not.
Using this syntax, we can assign to a new field of an array:
array["y"] = [9, 8, 7]
array.show()
[{x: -1, y: 9},
{x: -2, y: 8},
{x: 3, y: 7}]
If necessary, the new field will be broadcast to fit the array. For example, we can introduce a third field z that is set to the constant 0:
array["z"] = 0
array.show()
[{x: -1, y: 9, z: 0},
{x: -2, y: 8, z: 0},
{x: 3, y: 7, z: 0}]
A field can also be assigned deep inside a nested record by indexing with several field names at once, e.g.
nested = ak.zip({"a": ak.zip({"x": [1, 2, 3]})})
nested["a", "y"] = 2 * nested.a.x
nested.show()
[{a: {x: 1, y: 2}},
{a: {x: 2, y: 4}},
{a: {x: 3, y: 6}}]
Note that the following does not work:
nested["a"]["y"] = 2 * nested.a.x # does not work, nested["a"] is a copy!
nested.show()
[{a: {x: 1, y: 2}},
{a: {x: 2, y: 4}},
{a: {x: 3, y: 6}}]
Why does this happen? Well, Python first evaluates nested["a"], which returns a new ak.Array that is a (shallow) copy of the data in nested.a. Hence, the next step (setting y) operates on a different ak.Array, and nested.a remains unchanged. The Advanced Users call-out provides a more detailed explanation for why this does not work.
Using ak.with_field
Sometimes you might not want to modify an existing array, but rather produce a new array with the new field. One way to do this is to assign the field to a shallow copy, e.g.
import copy
copied = copy.copy(nested)
copied["z"] = [10, 20, 30]
copied.show()
[{a: {x: 1, y: 2}, z: 10},
{a: {x: 2, y: 4}, z: 20},
{a: {x: 3, y: 6}, z: 30}]
nested.show()
[{a: {x: 1, y: 2}},
{a: {x: 2, y: 4}},
{a: {x: 3, y: 6}}]
Awkward provides a dedicated function, ak.with_field(), that does this.
Note
Setting a field with array['x'] uses ak.with_field() under the hood, so performance is not a factor in choosing one over the other.