ak.from_arrow

Defined in awkward.operations.ak_from_arrow on line 16.

ak.from_arrow(array, *, generate_bitmasks=False, highlevel=True, behavior=None, attrs=None)
Parameters:
  • array (pyarrow.Array, pyarrow.ChunkedArray, pyarrow.RecordBatch, or pyarrow.Table) – Apache Arrow array to convert into an Awkward Array.

  • generate_bitmasks (bool) – If True and the Arrow/Parquet data does not carry Awkward metadata, create empty bitmasks for nullable types that have no bitmask in the Arrow/Parquet data, so that the Form (BitMaskedForm vs UnmaskedForm) is predictable.

  • highlevel (bool) – If True, return an ak.Array; otherwise, return a low-level ak.contents.Content subclass.

  • behavior (None or dict) – Custom ak.behavior for the output array, if high-level.

  • attrs (None or dict) – Custom attributes for the output array, if high-level.

Converts an Apache Arrow array into an Awkward Array.

This function always preserves the values of a dataset; i.e. the Python objects returned by ak.to_list are identical to the Python objects returned by Arrow’s to_pylist method. If ak.to_arrow was invoked with extensionarray=True, this function also preserves the data type (high-level ak.types.Type, though not the low-level ak.forms.Form), even through Parquet, making Parquet a good way to save Awkward Arrays for later use.

Because Awkward uses NumPy’s dtype system, timestamp types do not have timezones. If the input Arrow data contains timestamp types with timezones, the timezone information is silently dropped.

See also ak.to_arrow, ak.to_arrow_table, ak.from_parquet, ak.from_arrow_schema.