Low-level function that implements the logic to write a parquet file or a dataset from a data frame.

Usage

write_parquet_at_once(
  data,
  path_to_parquet,
  partition = "no",
  compression = "snappy",
  compression_level = NULL,
  ...
)

Arguments

data

The data.frame/tibble to write.

path_to_parquet

String that indicates the path to the directory where the output parquet file or dataset will be stored.

partition

string ("yes" or "no" - by default) that indicates whether you want to create a partitioned parquet file. If "yes", `"partitioning"` argument must be filled in. In this case, a folder will be created for each modality of the variable filled in `"partitioning"`.

compression

Compression algorithm. Default "snappy".

compression_level

Compression level; its meaning depends on the compression algorithm.

...

Additional format-specific arguments; see arrow::write_parquet().

Value

A dataset, as returned by arrow::open_dataset().
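
Because the returned value is an arrow Dataset, it can be queried lazily with dplyr verbs before materialising the result with collect(). A minimal sketch (not part of the package documentation), assuming the arrow and dplyr packages are installed:

library(dplyr)
ds <- write_parquet_at_once(iris, tempfile())
ds %>%
  filter(Species == "setosa") %>%
  collect()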

Examples


write_parquet_at_once(iris, tempfile())
#> Writing data...
#>  Data are available in parquet file under /tmp/RtmpTd607Q/file1854333f1c05
#> Writing data...


write_parquet_at_once(iris, tempfile(), partition = "yes", partitioning = c("Species"))
#> Writing data...
#>  Data are available in parquet dataset under /tmp/RtmpTd607Q/file18545d8bcc8
#> Writing data...


if (FALSE) {
write_parquet_at_once(iris, tempfile(), compression="gzip", compression_level = 5)
}
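
A further sketch (not from the package documentation) illustrating the folder-per-modality layout described under the partition argument; folder names such as "Species=setosa" assume arrow's default hive-style partitioning:

path <- tempfile()
write_parquet_at_once(iris, path, partition = "yes", partitioning = "Species")
# one subdirectory per modality of Species, e.g. "Species=setosa"
list.files(path)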