This function downloads the file if it is remote and unzips it if it is
zipped. If the file is neither remote nor zipped, the function returns the
input path unchanged.
If the zip archive contains multiple files, use filename_in_zip
to select the file you want to extract and use.
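The behaviour described above can be sketched in base R. This is only a rough illustration under stated assumptions, not the package's implementation: the helper name fetch_and_unzip is hypothetical, a path counts as remote when it starts with http, https or ftp, and a file counts as zipped when its extension is "zip".

```r
# Hypothetical helper for illustration only; not the package's own code.
fetch_and_unzip <- function(path, filename_in_zip = NULL) {
  # Assumption: a path is remote if it starts with a URL scheme.
  if (grepl("^(https?|ftp)://", path)) {
    local_copy <- tempfile(fileext = paste0(".", tools::file_ext(path)))
    utils::download.file(path, local_copy, mode = "wb", quiet = TRUE)
    path <- local_copy
  }

  # Assumption: a file is zipped if its extension is "zip".
  # Anything else is returned unchanged.
  if (tolower(tools::file_ext(path)) != "zip") {
    return(path)
  }

  # List the archive members without extracting them.
  members <- utils::unzip(path, list = TRUE)$Name
  if (is.null(filename_in_zip)) {
    if (length(members) != 1L) {
      stop("The archive holds several files; set filename_in_zip.")
    }
    filename_in_zip <- members
  }

  # Extract only the requested member into a fresh temporary directory.
  exdir <- tempfile("unzipped_")
  utils::unzip(path, files = filename_in_zip, exdir = exdir)
  file.path(exdir, filename_in_zip)
}
```

With a local, unzipped file the sketch returns its argument as is, which mirrors the "neither" case in the description.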
You can pipe the output into any of the *_to_parquet
functions.
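As a small illustration of piping, assuming R 4.1 or later for the native |> operator and an installed parquetize package, the returned path can be passed straight on as the first argument of csv_to_parquet(). The variable name out_path is only illustrative.

```r
library(parquetize)

# Illustrative output location (a temporary file)
out_path <- tempfile(fileext = ".parquet")

# download_extract() returns a local path, which the pipe passes on
# as the first argument of csv_to_parquet()
download_extract(parquetize_example("region_2022.csv")) |>
  csv_to_parquet(path_to_parquet = out_path)
```

The same pattern should apply to a remote or zipped input, with filename_in_zip given to download_extract() when the archive holds several files.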
Examples
# 1. unzip a local zip file
# 2. parquetize it
file_path <- download_extract(system.file("extdata","mtcars.csv.zip", package = "readr"))
csv_to_parquet(
  file_path,
  path_to_parquet = tempfile(fileext = ".parquet")
)
#> Reading data...
#> Writing data...
#> ✔ Data are available in parquet file under /tmp/Rtmp8qqupn/file178f65f8490.parquet
# 1. download a remote file
# 2. extract the file census2021-ts007-ctry.csv
# 3. parquetize it
file_path <- download_extract(
  "https://www.nomisweb.co.uk/output/census/2021/census2021-ts007.zip",
  filename_in_zip = "census2021-ts007-ctry.csv"
)
csv_to_parquet(
  file_path,
  path_to_parquet = tempfile(fileext = ".parquet")
)
#> Reading data...
#> Writing data...
#> ✔ Data are available in parquet file under /tmp/Rtmp8qqupn/file178f1b24ae3b.parquet
# the file is local and not zipped, so:
# 1. parquetize it
file_path <- download_extract(parquetize_example("region_2022.csv"))
csv_to_parquet(
  file_path,
  path_to_parquet = tempfile(fileext = ".parquet")
)
#> Reading data...
#> Writing data...
#> ✔ Data are available in parquet file under /tmp/Rtmp8qqupn/file178f126d80d2.parquet