ak.metadata_from_parquet
Defined in awkward.operations.ak_metadata_from_parquet on line 22.
- ak.metadata_from_parquet(path, *, storage_options=None, row_groups=None, ignore_metadata=False, scan_files=True)
- Parameters:
path (str) – Local filename or remote URL, passed to fsspec for resolution. May contain glob patterns. A list of paths is also allowed, but they must be data files, not directories.
storage_options – Passed to fsspec.parquet.open_parquet_file.
row_groups (None or set of int) – Row groups to read; must be non-negative. Order is ignored: the output array is presented in the order specified by Parquet metadata. If None, all row groups/all rows are read.
ignore_metadata (bool) – Ignore the dedicated _metadata file if found and instead derive metadata from the first data file.
scan_files (bool) – TODO
This function differs from ak.from_parquet._metadata as follows:
this function will always use a _metadata file, if present
if there is no _metadata, the schema comes from _common_metadata or the first data file
the total number of rows is always known
Returns dict containing
form: an Awkward Form representing the low-level type of the data (use .type to get a high-level type),
fs: the fsspec filesystem object,
paths: a list of matching path names,
col_counts: the number of rows in each row group,
columns: the columns defined by the schema,
num_rows: the length of the array that would be read by ak.from_parquet,
num_row_groups: the units that can be filtered (for the ak.from_parquet row_groups argument).
See also ak.from_parquet, ak.to_parquet.