ak.from_arrow
Defined in awkward.operations.ak_from_arrow on line 16.
- ak.from_arrow(array, *, generate_bitmasks=False, highlevel=True, behavior=None, attrs=None)
- Parameters:
  - array (pyarrow.Array, pyarrow.ChunkedArray, pyarrow.RecordBatch, or pyarrow.Table) – Apache Arrow array to convert into an Awkward Array.
  - generate_bitmasks (bool) – If enabled and Arrow/Parquet does not have Awkward metadata, generate_bitmasks=True creates empty bitmasks for nullable types that don’t have bitmasks in the Arrow/Parquet data, so that the Form (BitMaskedForm vs UnmaskedForm) is predictable.
  - highlevel (bool) – If True, return an ak.Array; otherwise, return a low-level ak.contents.Content subclass.
  - behavior (None or dict) – Custom ak.behavior for the output array, if high-level.
  - attrs (None or dict) – Custom attributes for the output array, if high-level.
Converts an Apache Arrow array into an Awkward Array.
This function always preserves the values of a dataset; i.e. the Python objects
returned by ak.to_list are identical to the Python objects returned by Arrow’s
to_pylist method. If ak.to_arrow was invoked with extensionarray=True, this
function also preserves the data type (high-level ak.types.Type, though not the
low-level ak.forms.Form), even through Parquet, making Parquet a good way to save
Awkward Arrays for later use.
Because Awkward Array uses NumPy’s dtype system, its timestamp types do not have timezones. If the input Arrow data contains timestamp types with timezones, the timezones are silently dropped.
See also ak.to_arrow, ak.to_arrow_table, ak.from_parquet, ak.from_arrow_schema.