ak.to_arrow
Defined in awkward.operations.ak_to_arrow on line 15.
- ak.to_arrow(array, *, list_to32=False, string_to32=False, bytestring_to32=False, emptyarray_to=None, categorical_as_dictionary=False, extensionarray=True, count_nulls=True)
- Parameters:
  - array – Array-like data (anything ak.to_layout recognizes).
  - list_to32 (bool) – If True, convert Awkward lists into 32-bit Arrow lists if they’re small enough, even if it means an extra conversion. Otherwise, signed 32-bit ak.types.ListType maps to Arrow ListType, signed 64-bit ak.types.ListType maps to Arrow LargeListType, and unsigned 32-bit ak.types.ListType picks whichever Arrow type its values fit into.
  - string_to32 (bool) – Same as the above for Arrow string and large_string.
  - bytestring_to32 (bool) – Same as the above for Arrow binary and large_binary.
  - emptyarray_to (None or dtype) – If None, ak.types.UnknownType maps to Arrow’s null type; otherwise, it is converted to the given numeric dtype.
  - categorical_as_dictionary (bool) – If True, ak.contents.IndexedArray and ak.contents.IndexedOptionArray labeled with __array__ = "categorical" are mapped to Arrow DictionaryArray; otherwise, the projection is evaluated before conversion (always the case without __array__ = "categorical").
  - extensionarray (bool) – If True, this function returns extended Arrow arrays (at all levels of nesting), which preserve metadata so that Awkward → Arrow → Awkward preserves the array’s ak.types.Type (though not the ak.forms.Form). If False, this function returns generic Arrow arrays that might be needed for third-party tools that don’t recognize Arrow’s extensions. Even with extensionarray=False, the values produced by Arrow’s to_pylist method are the same as the values produced by Awkward’s ak.to_list.
  - count_nulls (bool) – If True, count the number of missing values at each level and include these in the resulting Arrow array, which makes some downstream applications faster. If False, skip the up-front cost of counting them.
Converts an Awkward Array into an Apache Arrow array.
This produces arrays of type pyarrow.Array. You might need further
manipulations (using the pyarrow library) to build a pyarrow.ChunkedArray,
a pyarrow.RecordBatch, or a pyarrow.Table. For the latter, see ak.to_arrow_table.
This function always preserves the values of a dataset; i.e. the Python objects
returned by ak.to_list are identical to the Python objects returned by Arrow’s
to_pylist method. With extensionarray=True, this function also preserves the
data type (high-level ak.types.Type, though not the low-level ak.forms.Form),
even through Parquet, making Parquet a good way to save Awkward Arrays for later
use. If any third-party tools don’t recognize Arrow’s extension arrays, set this
option to False for plain Arrow arrays.
See also ak.from_arrow, ak.to_arrow_table, ak.to_parquet, ak.from_arrow_schema.