dataframe - Load dataset from R package using data(), assign it directly to a variable?
How can I load a dataset from an R package using the data() function and assign it directly to a variable, without creating a duplicate copy in my environment?
Put simply, can I do this without creating two identical data frames in my environment:

> data("faithful")   # Old Faithful geyser data from the datasets package
> x <- faithful
> ls()               # now I have two identical data frames, x and faithful, in my environment
[1] "faithful" "x"
> remove(faithful)   # now I've removed one of the redundant data frames

Try 1:
My first approach was to assign data("faithful") to x. But data() returns a string, so I end up with the data frame faithful and a character vector x in my environment.

> x <- data("faithful")
> x
[1] "faithful"   # a string, not the data frame "faithful" from the datasets package
> ls()
[1] "faithful" "x"

Try 2: I tried to be a little more sophisticated in my second attempt.

> x <- get(data("faithful"))   # works as far as the assignment goes
> ls()                         # but there is still a duplicate copy
[1] "faithful" "x"

A short note on my motivation for trying this. I have an R package with 5 large data.frames, each having the same columns. I want to efficiently generate the same calculated columns on all 5 data.frames. So I want to use data() within a list() constructor to put the 5 data.frames into a list. Then I want to use llply() and mutate() from the plyr package to iterate over the data frames in the list and create the calculated columns for each one. I don't want duplicate copies of 5 large datasets sitting in my environment inside a Shiny app that has a RAM limit.
Edit: I was able to use both of @henfiber's methods from the answer to figure out how to lazy-load entire data.frames into a named list.

The first command here works for assigning a data.frame to a new variable name.
# Loads faithful into the variable x.
# Note that I don't need to use the data() function to load faithful.
> delayedAssign("x", faithful)

But I wanted to create a named list x with elements y = data(faithful), z = data(iris), etc.
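I tried the code below and it didn't work, because delayedAssign() returns NULL rather than the promise it creates.

> x <- list(delayedAssign("y", faithful), delayedAssign("z", iris))
> ls()
[1] "x" "y" "z"   # x is a list of 2 NULLs; y and z are promises to faithful and iris

But I was able to construct a list of lazy-loaded data.frame objects in the following manner: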
# Define the function provided by henfiber
getdata <- function(...) {
  e <- new.env()
  name <- data(..., envir = e)[1]
  e[[name]]
}

# Create the list; this gives one object "x" of class list
# whose elements "y" and "z" are data.frames
x <- list(y = getdata(faithful), z = getdata(iris))
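For illustration, here is a minimal sketch of how the same helper could cover the workflow described in the question: build a named list from several datasets, then add the same calculated column to each one. It rests on a few assumptions that are not from the original post: faithful and iris stand in for the five package datasets, the column row_id is hypothetical, and base R lapply() and transform() are swapped in for plyr's llply() and mutate() to avoid the extra dependency.

```r
# Helper from the post: load a dataset into a throwaway environment and return it.
getdata <- function(...) {
  e <- new.env()
  name <- data(..., envir = e)[1]
  e[[name]]
}

# Stand-ins for the five package datasets (assumption for this sketch).
dataset_names <- c("faithful", "iris")

# Build a named list. Each dataset is loaded into a temporary environment,
# so no "faithful" or "iris" object is created in the calling environment.
dfs <- lapply(
  setNames(dataset_names, dataset_names),
  function(nm) getdata(list = nm, package = "datasets")
)

# Add the same calculated column to every data frame.
# row_id is a hypothetical column used only for illustration.
dfs <- lapply(dfs, function(df) transform(df, row_id = seq_len(nrow(df))))

str(dfs, max.level = 1)
```

Passing list = nm forwards the dataset name as a character string through the helper's dots to data(), which is convenient when the names are held in a vector.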
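Using a helper function:

# Define the function
getdata <- function(...) {
  e <- new.env()                    # temporary environment, discarded after the call
  name <- data(..., envir = e)[1]   # data() loads into e and returns the dataset name
  e[[name]]
}

# Load the data by calling getdata()
x <- getdata("faithful")

Or using an anonymous function:

# Look the dataset up in the same temporary environment it was loaded into,
# rather than relying on the package being attached to the search path
x <- (function(..., e = new.env()) get(data(..., envir = e)[1], envir = e))("faithful")

Lazy evaluation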
You should consider lazy loading your data by adding LazyData: true to the DESCRIPTION file of your package.

If you use RStudio, after running data("faithful") you'll see in the Environment panel that the "faithful" data.frame is called a "promise" (another, less common name is "thunk") and is greyed out. That means it is lazily evaluated by R and has not been loaded into memory yet. You can even lazy load it into your "x" variable with the delayedAssign() function:
data("faithful")               # lazy load "faithful"
delayedAssign("x", faithful)   # lazily assign "x" with a reference to "faithful"
rm(faithful)                   # remove "faithful"

# Still nothing has been loaded into memory yet
summary(x)                     # now x has been loaded and evaluated

You can learn more about lazy evaluation here.
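To make the promise behaviour concrete, the sketch below records when the delayed expression actually runs. The environments e and state and the flag evaluated are names introduced here for illustration only. The flag is set as a side effect inside the promise, so it shows that nothing is evaluated until x is first accessed.

```r
# Environment that will hold the lazily assigned variable.
e <- new.env()

# Small environment used only to record whether the promise has run.
state <- new.env()
assign("evaluated", FALSE, envir = state)

delayedAssign("x", {
  assign("evaluated", TRUE, envir = state)  # side effect marks the moment of evaluation
  datasets::faithful
}, assign.env = e)

before <- get("evaluated", envir = state)   # FALSE: the promise has not been forced
n_rows <- nrow(e$x)                         # first access to x forces the promise
after  <- get("evaluated", envir = state)   # TRUE: the expression has now run

print(c(before = before, after = after))
```

Note that a promise is evaluated at most once: later accesses to e$x reuse the cached value rather than running the expression again.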