Vroom

Supercharge data import in R

I’m very excited to learn about vroom, RStudio’s latest tidyverse offering. It imports data much faster than existing R solutions.

Check out the following benchmark, which compares a handful of similar import functions, including combinations of various libraries.

Benchmark

The speed is already a game-changer, but the following features sweeten the deal:

  • Similar to readr
    vroom shares many features with readr, including nearly all of its parsing features for delimited and fixed-width files.

  • Reading multiple files
    Native support for reading from multiple files and connections. It reads sets of files with the same columns into one table.

  • Delimited files
    Automatically guesses the delimiter of a file.

  • Compressed files
    Automatically reads and writes zip, gzip, bz2 and xz compressed files with the standard file extensions.

  • Remote files
    Read files from the internet by passing the URL of the file to vroom().

  • Reading and writing from pipe connections
    Provides efficient input and output from pipe() connections, which is useful, for example, for pre-filtering large inputs.

  • Column selection
    The col_select feature makes it easy to select columns to retain or omit. It supports selection helpers and renaming too, including helper functions to repair names.
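
To make a few of these features concrete, here is a minimal sketch that exercises delimiter guessing, multi-file reading, column selection with renaming, and gzip output in one go. The file names (part1.csv, part2.csv, combined.csv.gz), the sample data and the new column name identifier are made up for illustration; the sketch also assumes a reasonably recent vroom release, since it relies on the show_col_types argument.

```r
library(vroom)

# Hypothetical sample data: two small CSV files with identical columns,
# written to a temporary directory so the example is self-contained.
demo_dir <- tempfile("vroom-demo-")
dir.create(demo_dir)
part1 <- data.frame(id = 1:3, name = c("a", "b", "c"), score = c(1.5, 2.5, 3.5))
part2 <- data.frame(id = 4:5, name = c("d", "e"), score = c(4.5, 5.5))
vroom_write(part1, file.path(demo_dir, "part1.csv"), delim = ",")
vroom_write(part2, file.path(demo_dir, "part2.csv"), delim = ",")

files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Reading multiple files: passing a vector of paths stacks the rows into one
# table. No delim is given, so vroom guesses it from the file contents.
combined <- vroom(files, show_col_types = FALSE)

# Column selection: keep only two columns and rename one while reading.
selected <- vroom(
  files,
  col_select = c(identifier = id, score),
  show_col_types = FALSE
)

# Compressed files: a .gz extension is enough to write and read gzip.
gz_path <- file.path(demo_dir, "combined.csv.gz")
vroom_write(combined, gz_path, delim = ",")
roundtrip <- vroom(gz_path, show_col_types = FALSE)
```

Passing the vector of paths directly to vroom() avoids the usual lapply-then-bind pattern, and col_select skips the unwanted columns at read time instead of dropping them afterwards.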

vroom significantly speeds up my workflow, making it my default tool for importing files into R. You can find the original article here.

