dataframe - Load a dataset from an R package using data() and assign it directly to a variable?
How do you load a dataset from an R package using the data() function and assign it directly to a variable, without creating a duplicate copy in your environment?
Put simply, can you do this without creating two identical data frames in your environment:
> data("faithful")   # Old Faithful geyser data from the datasets package
> x <- faithful
> ls()               # two identical data frames, x and faithful, in my environment
[1] "faithful" "x"
> remove(faithful)   # I've removed one of the redundant data frames
Try 1:
My first approach was to assign data("faithful") to x. But data() returns a string, so I end up with the data frame faithful and a character vector x in my environment.
> x <- data("faithful")
> x
[1] "faithful"       # a string, not the data frame "faithful" from the datasets package
> ls()
[1] "faithful" "x"
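The behaviour in Try 1 can be checked in isolation. This is only a minimal sketch: it assumes the datasets package is available, and it uses the `list`, `package` and `envir` arguments of data() to load into a scratch environment (the variable names `e`, `ret`, `loaded` and `is_df` are made up for the illustration).

```r
# Load into a scratch environment so the global environment is untouched.
e <- new.env()
ret <- data(list = "faithful", package = "datasets", envir = e)

ret_class <- class(ret)              # what data() hands back
loaded    <- ls(e)                   # what was actually created in `e`
is_df     <- is.data.frame(e[[ret]]) # the data frame is reachable by that name
```

The return value is just the name of the dataset, while the data frame itself is created as a side effect in whichever environment `envir` points to.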
Try 2: I tried to be a little more sophisticated in my second attempt.
> x <- get(data("faithful"))   # works as far as the assignment goes
> ls()                         # but there is still a duplicate copy
[1] "faithful" "x"
A short note on my motivation for trying this. I have an R package with 5 large data.frames, each having the same columns. I want to efficiently generate the same calculated columns on all 5 data.frames. So I want to use data() within a list() constructor to put the 5 data.frames into a list. Then I want to use llply() and mutate() from the plyr package to iterate over the data frames in the list and create the calculated columns for each one. But I don't want duplicate copies of the 5 large datasets sitting in my environment, because this runs within a Shiny app with a RAM limit.
Edit: I was able to use both of @henfiber's methods from his answer to figure out how to lazy-load entire data.frames into a named list.
The first command here works for assigning a data.frame to a new variable name.
# Loads faithful into the variable x.
# Note that you don't need to use the data() function to load faithful
> delayedAssign("x", faithful)
But I wanted to create a named list x with elements y = data(faithful), z = data(iris), etc.
I tried the below and it didn't work.
> x <- list(delayedAssign("y", faithful), delayedAssign("z", iris))
> ls()
[1] "x" "y" "z"   # x is a list of 2 NULLs; y and z are promises for faithful and iris
But I was able to construct a list of lazy-loaded data.frame objects in the following manner:
# define the function provided by henfiber
getdata <- function(...) {
  e <- new.env()
  name <- data(..., envir = e)[1]
  e[[name]]
}
# create the list; this gives 1 object "x" of class list
# whose elements "y" and "z" are data.frames
x <- list(y = getdata(faithful), z = getdata(iris))
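If the dataset names are held in a character vector rather than typed literally, forwarding them through `...` is unlikely to work, because data() reads its `...` arguments unevaluated; its documented `list` argument takes a character vector instead. The following is a hedged sketch of that variant, not the code from the answer: `load_dataset` and `dfs` are hypothetical names, and it assumes each dataset file defines an object with the same name as the dataset (true for faithful and iris).

```r
# Hypothetical helper: load one dataset by name into a throwaway
# environment and return the object, leaving the caller's environment clean.
load_dataset <- function(name, package = NULL) {
  e <- new.env()
  loaded <- data(list = name, package = package, envir = e)
  e[[loaded[1]]]  # assumes the object is named after the dataset
}

# Named character vector: list element name = dataset name
wanted <- c(y = "faithful", z = "iris")

# simplify = FALSE keeps the result as a named list of data frames
dfs <- sapply(wanted, load_dataset, simplify = FALSE)
```

The names of `wanted` carry over to the list, so `dfs$y` and `dfs$z` play the role of the `y` and `z` elements above.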
Using a helper function:
# define the function
getdata <- function(...) {
  e <- new.env()
  name <- data(..., envir = e)[1]
  e[[name]]
}
# load the data by calling getdata()
x <- getdata("faithful")
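Or using an anonymous function:
# look the object up in the same environment that data() loaded it into
x <- (function(...) { e <- new.env(); get(data(..., envir = e)[1], envir = e) })("faithful")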
Lazy evaluation
You should also consider lazy loading your data by adding LazyData: true to the DESCRIPTION file of your package.
If you use RStudio, after running data("faithful") you'll see in the Environment panel that the "faithful" data.frame is called a "promise" (another, less common name is "thunk") and is greyed out. That means it is lazily evaluated by R and has not yet been loaded into memory. You can lazy load the "x" variable too with the delayedAssign() function:
data("faithful")               # lazy load "faithful"
delayedAssign("x", faithful)   # lazily assign "x" with a reference to "faithful"
rm(faithful)                   # remove "faithful"
# still nothing has been loaded into memory yet
summary(x)                     # now x has been loaded and evaluated
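To see that the promise really is left unevaluated until first use, the expression given to delayedAssign() can be made to record when it runs. This is only an illustrative sketch: the `state` and `env` environments and the `before`, `n_rows` and `after` variables are made up for the demonstration, and the data frame is taken directly from `datasets::faithful`.

```r
state <- new.env()
state$evaluated <- FALSE   # flips to TRUE when the promise is forced

env <- new.env()           # holds the lazily assigned variable
delayedAssign(
  "x",
  { state$evaluated <- TRUE; datasets::faithful },
  eval.env   = environment(),
  assign.env = env
)

before <- state$evaluated  # promise created, expression not run yet
n_rows <- nrow(env$x)      # first access forces the promise
after  <- state$evaluated  # expression has now been run
```

The flag only changes once `env$x` is accessed, which is the same behaviour the greyed-out entry in the RStudio Environment panel reflects.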
You can learn more about lazy evaluation here.