ak.from_arrow
Defined in awkward.operations.ak_from_arrow on line 16.
- ak.from_arrow(array, *, generate_bitmasks=False, highlevel=True, behavior=None, attrs=None)
- Parameters:
array (pyarrow.Array, pyarrow.ChunkedArray, pyarrow.RecordBatch, or pyarrow.Table) – Apache Arrow array to convert into an Awkward Array.
generate_bitmasks (bool) – If enabled and Arrow/Parquet does not have Awkward metadata, generate_bitmasks=True creates empty bitmasks for nullable types that don’t have bitmasks in the Arrow/Parquet data, so that the Form (BitMaskedForm vs UnmaskedForm) is predictable.
highlevel (bool) – If True, return an ak.Array; otherwise, return a low-level ak.contents.Content subclass.
behavior (None or dict) – Custom ak.behavior for the output array, if high-level.
attrs (None or dict) – Custom attributes for the output array, if high-level.
Converts an Apache Arrow array into an Awkward Array.
This function always preserves the values of a dataset; i.e. the Python objects returned by ak.to_list are identical to the Python objects returned by Arrow’s to_pylist method. If ak.to_arrow was invoked with extensionarray=True, this function also preserves the data type (high-level ak.types.Type, though not the low-level ak.forms.Form), even through Parquet, making Parquet a good way to save Awkward Arrays for later use.
Because Awkward uses NumPy’s dtype system, timestamp types do not have timezones. If the input Arrow data contains timestamp types with timezones, the timezone information is silently dropped.
See also ak.to_arrow, ak.to_arrow_table, ak.from_parquet, ak.from_arrow_schema.