APIs to read from Parquet format.
Re-exports
pub use parquet2::fallible_streaming_iterator;
pub use schema::infer_schema;
Modules
APIs to handle Parquet <-> Arrow schemas.
APIs exposing parquet2’s statistics as arrow’s statistics.
Structs
A FallibleStreamingIterator that decompresses CompressedDataPage into DataPage.
Metadata for a column chunk.
A descriptor for leaf-level primitive columns. This encapsulates information such as definition and repetition levels and is used to re-assemble nested data.
A CompressedDataPage is a compressed, encoded representation of a Parquet data page.
It holds actual data and thus cloning it is expensive.
A DataPage is an uncompressed, encoded representation of a Parquet data page.
It holds actual data and thus cloning it is expensive.
Decompressor that allows re-using the page buffer of PageIterator.
Metadata for a Parquet file.
An iterator of Chunks coming from row groups of a parquet file.
A page iterator iterates over a row group’s pages. In parquet, pages are guaranteed to be contiguously arranged in memory and therefore must be read in sequence.
A MutStreamingIterator of pre-read column chunks.
Metadata for a row group.
An Iterator<Item=RowGroupDeserializer> from row groups of a parquet file.
Timestamp logical type annotation
Enums
Errors generated by this crate
Representation of a Parquet type.
Used to describe primitive leaf fields and structs, including top-level schema.
Note that the top-level schema type is represented using GroupType, whose repetition is None.
State of MutStreamingIterator.
Traits
Trait describing a MutStreamingIterator of column chunks.
Trait describing a FallibleStreamingIterator of DataPage.
A fallible, streaming iterator.
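The following is a minimal, standard-library-only sketch of what a fallible, streaming iterator can look like, and of why it suits page decompression: each item is borrowed from the iterator, so one buffer can be re-used for every page. The trait `FallibleStreaming`, the type `RleDecoder`, and the toy run-length encoding are hypothetical names invented for illustration; they are not the trait re-exported by this module nor any real Parquet codec.

```rust
/// Hypothetical stand-in for a fallible, streaming iterator.
/// Items are borrowed from `self`, so they cannot outlive the next `advance`.
trait FallibleStreaming {
    type Item: ?Sized;
    type Error;

    /// Moves to the next item; this step may fail.
    fn advance(&mut self) -> Result<(), Self::Error>;

    /// Borrows the current item, or returns `None` once exhausted.
    fn get(&self) -> Option<&Self::Item>;

    /// Advances, then borrows the current item.
    fn next(&mut self) -> Result<Option<&Self::Item>, Self::Error> {
        self.advance()?;
        Ok(self.get())
    }
}

/// Toy decoder (assumption): each page is a list of `(count, value)` runs.
struct RleDecoder {
    pages: std::vec::IntoIter<Vec<(u8, u8)>>,
    buffer: Vec<u8>, // re-used for every decoded page
    has_item: bool,
}

impl RleDecoder {
    fn new(pages: Vec<Vec<(u8, u8)>>) -> Self {
        Self { pages: pages.into_iter(), buffer: Vec::new(), has_item: false }
    }
}

impl FallibleStreaming for RleDecoder {
    type Item = [u8];
    type Error = String;

    fn advance(&mut self) -> Result<(), String> {
        self.has_item = false;
        if let Some(page) = self.pages.next() {
            self.buffer.clear(); // keep the allocation, drop the old contents
            for (count, value) in page {
                if count == 0 {
                    return Err("zero-length run".to_string());
                }
                self.buffer.extend(std::iter::repeat(value).take(count as usize));
            }
            self.has_item = true;
        }
        Ok(())
    }

    fn get(&self) -> Option<&[u8]> {
        if self.has_item { Some(&self.buffer) } else { None }
    }
}

/// Drives the iterator to completion and records each decoded page length.
fn collect_lengths(mut pages: RleDecoder) -> Result<Vec<usize>, String> {
    let mut lengths = Vec::new();
    while let Some(page) = pages.next()? {
        lengths.push(page.len());
    }
    Ok(lengths)
}
```

The `while let Some(page) = pages.next()?` loop is the usual way to consume such an iterator: an error aborts the loop through `?`, and `None` ends it normally.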
Functions
Returns a new PageIterator by seeking reader to the beginning of column_chunk.
Returns a stream of compressed data pages
Reads a file’s metadata.
Decompresses the page, using buffer for decompression.
If page.buffer.len() == 0, there was no decompression and the buffer was moved.
Else, decompression took place.
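One possible reading of the "moved versus decompressed" behaviour above can be sketched with plain vectors. Everything here is an assumption made for illustration: the types `Codec`, `CompressedPage` and `Page`, the toy run-length codec, and the choice to report the outcome as a boolean are hypothetical and do not reproduce this crate's signature.

```rust
use std::mem;

/// Hypothetical codecs: either raw bytes or toy `(count, value)` byte pairs.
#[derive(Clone, Copy)]
enum Codec {
    Uncompressed,
    RunLength,
}

struct CompressedPage {
    codec: Codec,
    buffer: Vec<u8>,
}

struct Page {
    buffer: Vec<u8>,
}

/// Returns the decoded page and whether decompression took place.
/// `scratch` is a caller-owned buffer used as the decompression target.
fn decompress(page: CompressedPage, scratch: &mut Vec<u8>) -> Result<(Page, bool), String> {
    match page.codec {
        // Nothing to decode: move the bytes into the result without copying.
        Codec::Uncompressed => Ok((Page { buffer: page.buffer }, false)),
        Codec::RunLength => {
            if page.buffer.len() % 2 != 0 {
                return Err("truncated run-length data".to_string());
            }
            scratch.clear();
            for pair in page.buffer.chunks_exact(2) {
                scratch.extend(std::iter::repeat(pair[1]).take(pair[0] as usize));
            }
            // Hand the decoded bytes to the page and give the caller the old
            // compressed allocation back (emptied) so it can be re-used.
            let mut recycled = page.buffer;
            recycled.clear();
            let decoded = mem::replace(scratch, recycled);
            Ok((Page { buffer: decoded }, true))
        }
    }
}
```

In this sketch the uncompressed path never touches `scratch`, which is the point of moving the buffer: no allocation and no copy happen when there is nothing to decode.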
Returns a ColumnIterator of column chunks corresponding to field.
In contrast to get_page_iterator, which returns a single iterator of pages, this iterator returns multiple iterators, one per physical column of the field.
For primitive fields (e.g. i64), ColumnIterator yields exactly one column.
For complex fields, it yields multiple columns.
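The relationship between a field and its physical columns can be illustrated by counting leaves in a schema tree. This is a sketch under assumptions: `FieldType` and `physical_columns` are hypothetical names, and only primitives and structs are modelled.

```rust
/// Hypothetical schema node: a primitive leaf or a struct of child fields.
enum FieldType {
    Primitive,
    Struct(Vec<FieldType>),
}

/// Counts the physical (leaf) columns a field maps to.
fn physical_columns(field: &FieldType) -> usize {
    match field {
        // A primitive field such as `i64` is stored as exactly one column.
        FieldType::Primitive => 1,
        // A struct is stored as the columns of all of its children, recursively.
        FieldType::Struct(children) => children.iter().map(physical_columns).sum(),
    }
}
```

For example, a struct with one primitive child and one nested struct of two primitives maps to three physical columns, so an iterator over its column chunks would yield three entries.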
Creates a new iterator of compressed pages.
Reads all columns that are part of the parquet field field_name
Reads all columns that are part of the parquet field field_name
Returns a vector of iterators of Array corresponding to the top-level parquet fields whose names match the names in fields.
Reads a parquet file’s metadata synchronously.
Reads a parquet file’s metadata asynchronously.
Type Definitions
Type declaration for a page filter