ak.sum
Defined in awkward.operations.ak_sum on line 24.
- ak.sum(array, axis=None, *, keepdims=False, mask_identity=False, highlevel=True, behavior=None, attrs=None)
- Parameters:
  - array – Array-like data (anything ak.to_layout recognizes).
  - axis (None or int) – If None, combine all values from the array into a single scalar result; if an int, group by that axis: 0 is the outermost, 1 is the first level of nested lists, etc., and negative axis counts from the innermost: -1 is the innermost, -2 is the next level up, etc.
  - keepdims (bool) – If False, this reducer decreases the number of dimensions by 1; if True, the reduced values are wrapped in a new length-1 dimension so that the result of this operation may be broadcasted with the original array.
  - mask_identity (bool) – If True, reducing over empty lists results in None (an option type); otherwise, reducing over empty lists results in the operation’s identity.
  - highlevel (bool) – If True, return an ak.Array; otherwise, return a low-level ak.contents.Content subclass.
  - behavior (None or dict) – Custom ak.behavior for the output array, if high-level.
  - attrs (None or dict) – Custom attributes for the output array, if high-level.
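The negative-axis rule can be illustrated with a small pure-Python sketch. It assumes every branch of the array nests to the same depth; `normalize_axis` and its `depth` argument are hypothetical names used for illustration and are not part of the Awkward Array API.

```python
def normalize_axis(axis, depth):
    """Convert a possibly negative axis to its non-negative equivalent.

    `depth` is the number of dimensions (a hypothetical helper argument).
    """
    if not -depth <= axis < depth:
        raise ValueError(f"axis={axis} is out of range for depth {depth}")
    return axis + depth if axis < 0 else axis


# For a list of lists (depth 2), axis=-1 is the inner axis and axis=-2 the outer one.
inner = normalize_axis(-1, depth=2)
outer = normalize_axis(-2, depth=2)
```

Under that assumption, `inner` is 1 and `outer` is 0, matching the description of the axis parameter.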
Sums over array (many types supported, including all Awkward Arrays and Records). The identity of addition is 0 and it is usually not masked. This operation is the same as NumPy’s sum if all lists at a given dimension have the same length and no None values, but it generalizes to cases where they do not.
For example, consider this array, in which all lists at a given dimension have the same length.
>>> array = ak.Array([[ 0.1,  0.2,  0.3],
...                   [10.1, 10.2, 10.3],
...                   [20.1, 20.2, 20.3],
...                   [30.1, 30.2, 30.3]])
A sum over axis=-1 combines the inner lists, leaving one value per outer list:
>>> ak.sum(array, axis=-1)
<Array [0.6, 30.6, 60.6, 90.6] type='4 * float64'>
while a sum over axis=0 combines the outer lists, leaving one value per inner list:
>>> ak.sum(array, axis=0)
<Array [60.4, 60.8, 61.2] type='3 * float64'>
Now with some values missing,
>>> array = ak.Array([[ 0.1,  0.2      ],
...                   [10.1            ],
...                   [20.1, 20.2, 20.3],
...                   [30.1, 30.2      ]])
The sum over axis=-1 results in
>>> ak.sum(array, axis=-1)
<Array [0.3, 10.1, 60.6, 60.3] type='4 * float64'>
and the sum over axis=0 results in
>>> ak.sum(array, axis=0)
<Array [60.4, 50.6, 20.3] type='3 * float64'>
How we ought to sum over the innermost lists is unambiguous, but for all other axis values, we must choose whether to align contents to the left before summing, to the right before summing, or something else. As suggested by the way the text has been aligned, we choose the left-alignment convention: the first axis=0 result is the sum of all first elements
60.4 = 0.1 + 10.1 + 20.1 + 30.1
the second is the sum of all second elements
50.6 = 0.2 + 20.2 + 30.2
and the third is the sum of the only third element
20.3 = 20.3
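The left-alignment convention can be modeled in plain Python. This is only an illustrative sketch, not how Awkward Array implements the reduction; `sum_axis0` is a hypothetical helper name.

```python
from itertools import zip_longest


def sum_axis0(lists):
    """Left-aligned column sums over variable-length lists (a model of axis=0)."""
    # zip_longest pads the short rows on the right, which is what left-alignment
    # means; the padding (None) is skipped rather than added.
    columns = zip_longest(*lists, fillvalue=None)
    return [sum(x for x in column if x is not None) for column in columns]


ragged = [[0.1, 0.2], [10.1], [20.1, 20.2, 20.3], [30.1, 30.2]]
column_sums = sum_axis0(ragged)  # approximately [60.4, 50.6, 20.3]
```

The number of results equals the length of the longest inner list, and each result only includes the rows that are long enough to reach that position.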
The same is true if the values were None, rather than gaps:
>>> array = ak.Array([[ 0.1,  0.2, None],
...                   [10.1, None, None],
...                   [20.1, 20.2, 20.3],
...                   [30.1, 30.2, None]])
>>> ak.sum(array, axis=-1)
<Array [0.3, 10.1, 60.6, 60.3] type='4 * float64'>
>>> ak.sum(array, axis=0)
<Array [60.4, 50.6, 20.3] type='3 * float64'>
However, the missing value placeholder, None, allows us to align the remaining data differently:
>>> array = ak.Array([[None,  0.1,  0.2],
...                   [None, None, 10.1],
...                   [20.1, 20.2, 20.3],
...                   [None, 30.1, 30.2]])
Now the axis=-1 result is the same but the axis=0 result has changed:
>>> ak.sum(array, axis=-1)
<Array [0.3, 10.1, 60.6, 60.3] type='4 * float64'>
>>> ak.sum(array, axis=0)
<Array [20.1, 50.4, 60.8] type='3 * float64'>
because
20.1 = 20.1
50.4 = 0.1 + 20.2 + 30.1
60.8 = 0.2 + 10.1 + 20.3 + 30.2
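As a rough illustration of how None placeholders change the alignment, the following pure-Python sketch right-aligns the ragged data by padding on the left and then sums column by column. `pad_left` and `sum_columns` are hypothetical helpers written for this example; they are not Awkward Array functions.

```python
def pad_left(row, width):
    """Right-align a row by prepending None placeholders up to `width`."""
    return [None] * (width - len(row)) + list(row)


def sum_columns(rows):
    """Column sums over equal-length rows, skipping None placeholders."""
    return [sum(x for x in column if x is not None) for column in zip(*rows)]


ragged = [[0.1, 0.2], [10.1], [20.1, 20.2, 20.3], [30.1, 30.2]]
width = max(len(row) for row in ragged)
right_aligned = [pad_left(row, width) for row in ragged]
column_sums = sum_columns(right_aligned)  # approximately [20.1, 50.4, 60.8]
```

The row sums are unaffected by where the placeholders sit, while the column sums depend on it.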
If, instead of missing numbers, we had missing lists,
>>> array = ak.Array([[ 0.1,  0.2,  0.3],
...                   None,
...                   [20.1, 20.2, 20.3],
...                   [30.1, 30.2, 30.3]])
then the placeholder would pass through the axis=-1 sum because summing over the inner dimension shouldn’t change the length of the outer dimension.
>>> ak.sum(array, axis=-1)
<Array [0.6, None, 60.6, 90.6] type='4 * ?float64'>
However, the axis=0 sum loses information about the None value.
>>> ak.sum(array, axis=0)
<Array [50.3, 50.6, 50.9] type='3 * float64'>
which is
50.3 = 0.1 + (None) + 20.1 + 30.1
50.6 = 0.2 + (None) + 20.2 + 30.2
50.9 = 0.3 + (None) + 20.3 + 30.3
An axis=0 sum would be reducing that information if it had not been None, anyway. If the None values were replaced with 0, the result for axis=0 would be the same. The result for axis=-1 would not be the same because this None is in the 0 axis, not the axis that axis=-1 sums over.
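The different treatment of a missing list along the two axes can be sketched in plain Python. This is a simplified model that assumes all present rows have the same length; `sum_inner` and `sum_outer` are hypothetical names for illustration.

```python
def sum_inner(rows):
    """Model of axis=-1: one result per row, with a missing row staying None."""
    return [None if row is None else sum(row) for row in rows]


def sum_outer(rows):
    """Model of axis=0: missing rows contribute nothing to the column sums."""
    present = [row for row in rows if row is not None]
    # zip(*present) pairs up positions; this relies on equal-length rows.
    return [sum(column) for column in zip(*present)]


rows = [[0.1, 0.2, 0.3], None, [20.1, 20.2, 20.3], [30.1, 30.2, 30.3]]
inner = sum_inner(rows)  # keeps length 4, with None in position 1
outer = sum_outer(rows)  # approximately [50.3, 50.6, 50.9]
```

In this model the inner sum preserves the outer length, so the None has a place to go, whereas the outer sum collapses the dimension that held the None.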
The keepdims parameter ensures that the number of dimensions does not change: scalar results are put into new length-1 dimensions:
>>> ak.sum(array, axis=-1, keepdims=True)
<Array [[0.6], None, [60.6], [90.6]] type='4 * option[1 * float64]'>
>>> ak.sum(array, axis=0, keepdims=True)
<Array [[50.3, 50.6, 50.9]] type='1 * var * float64'>
and axis=None ignores all None values and adds up everything in the array (keepdims has no effect).
>>> ak.sum(array, axis=None)
151.8
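Both behaviors can be modeled with a short pure-Python sketch, using integers so that the results are exact. `keepdims_wrap` and `sum_all` are hypothetical helpers for illustration only and do not reflect the library's internals.

```python
def keepdims_wrap(values):
    """Model of keepdims=True for axis=-1: wrap each scalar in a length-1 list."""
    return [None if value is None else [value] for value in values]


def sum_all(data):
    """Model of axis=None: add every number at any depth, skipping None."""
    if data is None:
        return 0
    if isinstance(data, list):
        return sum(sum_all(item) for item in data)
    return data


rows = [[1, 2, 3], None, [4, 5, 6], [7, 8, 9]]
row_sums = [None if row is None else sum(row) for row in rows]
wrapped = keepdims_wrap(row_sums)  # [[6], None, [15], [24]]
total = sum_all(rows)              # 45
```

The wrapped result has the same nesting depth as `rows`, while the total is a plain scalar regardless of depth.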
The mask_identity parameter, which has no equivalent in NumPy, inserts None in the output wherever a reduction takes place over zero elements. This is different from reductions that are otherwise equal to the identity or are equal to the identity by cancellation.
>>> array = ak.Array([[2.2, 2.2], [4.4, -2.2, -2.2], [], [0.0]])
>>> ak.sum(array, axis=-1)
<Array [4.4, 0, 0, 0] type='4 * float64'>
>>> ak.sum(array, axis=-1, mask_identity=True)
<Array [4.4, 0, None, 0] type='4 * ?float64'>
The third list is reduced to 0 if mask_identity=False because 0 is the identity of addition, but it is reduced to None if mask_identity=True.
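The distinction between an empty reduction and a sum that merely equals zero can be modeled as follows. This is a pure-Python sketch of the described behavior for a single level of lists; `sum_lists` is a hypothetical name, not the library's implementation.

```python
def sum_lists(lists, mask_identity=False):
    """Model of an axis=-1 sum with an optional mask for empty lists."""
    results = []
    for values in lists:
        if mask_identity and len(values) == 0:
            results.append(None)  # nothing was reduced, so report a missing value
        else:
            results.append(sum(values))  # sum([]) is 0, the identity of addition
    return results


data = [[2.2, 2.2], [4.4, -2.2, -2.2], [], [0.0]]
unmasked = sum_lists(data)
masked = sum_lists(data, mask_identity=True)
```

Only the empty list becomes None in `masked`; the list that cancels to zero and the list containing a single 0.0 still produce 0.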
See also ak.nansum.