dataframe - Load dataset from "R" package using data(), assign it directly to a variable? -


How do you load a dataset from an R package using the data() function, and assign it directly to a variable without creating a duplicate copy in your environment?

Put simply, can you do this without creating two identical dfs in your environment:

    > data("faithful") # Old Faithful Geyser Data from the datasets package
    > x <- faithful
    > ls() # now I have 2 identical dfs - x and faithful - in my environment
    [1] "faithful" "x"
    > remove(faithful) # now I've removed one of the redundant dfs

Try 1:

My first approach was to assign data("faithful") to x. But data() returns a string, so now I have the df faithful and a character vector x in my environment.

    > x <- data("faithful")
    > x
    [1] "faithful" # a string, not the df "faithful" from the datasets package
    > ls()
    [1] "faithful" "x"

Try 2: I tried to be a little more sophisticated in my second attempt.

    > x <- get(data("faithful")) # this works as far as the assignment goes
    > ls() # but I still get the duplicate copy
    [1] "faithful" "x"

A short note on my motivation for trying this. I have an R package with 5 large data.frames, each having the same columns. I want to efficiently generate the same calculated columns on all 5 data.frames. So I want to use data() within a list() constructor to put the 5 data.frames into a list. Then I want to use llply() and mutate() from the plyr package to iterate over the dfs in the list and create the calculated columns for each df. But I don't want duplicate copies of the 5 large datasets sitting in the environment, because this will run within a Shiny app with a RAM limit.


Edit: I was able to use both of @henfiber's methods from the answer to figure out how to lazy-load entire data.frames into a named list.

The first command here works for assigning a data.frame to a new variable name.

    # this loads faithful into a variable x.
    # note that we don't need to use the data() function to load faithful
    > delayedAssign("x", faithful)

But I wanted to create a named list x with elements y = data(faithful), z = data(iris), etc.

So I tried the below, and it didn't work.

    > x <- list(delayedAssign("y", faithful), delayedAssign("z", iris))
    > ls()
    [1] "x" "y" "z" # x is a list of 2 NULLs, y & z are promises to faithful & iris

But I was finally able to construct a list of lazy-loaded data.frame objects in the following manner:

    # define the function provided by henfiber
    getdata <- function(...) {
        e <- new.env()
        name <- data(..., envir = e)[1]
        e[[name]]
    }

    # create the list; this gives 1 object "x" of class list
    # with elements "y" and "z", which are the data.frames
    x <- list(y = getdata(faithful), z = getdata(iris))
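To tie this back to the motivation above, here is a minimal sketch of adding the same calculated column to every data.frame in such a list. It swaps in base R's lapply() and transform() for plyr's llply() and mutate() to stay dependency-free. The two halves of faithful are only stand-ins for the package's same-column datasets, and the column name ratio is hypothetical.

```r
# Helper from the answer: load a dataset into a throwaway environment
getdata <- function(...) {
  e <- new.env()
  name <- data(..., envir = e)[1]
  e[[name]]
}

# Stand-ins for several same-column datasets (assumption for illustration):
# the two halves of faithful, which has 272 rows.
f <- getdata("faithful")
dfs <- list(first = f[1:136, ], second = f[137:272, ])

# Add the same calculated column to every data.frame in the list.
# "ratio" is a hypothetical column name.
dfs <- lapply(dfs, function(df) transform(df, ratio = eruptions / waiting))

names(dfs$first)
```

Each element of dfs should now carry the extra column, and only dfs, f, and getdata exist in the calling environment.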

Using a helper function:

    # define the function
    getdata <- function(...) {
        e <- new.env()
        name <- data(..., envir = e)[1]
        e[[name]]
    }

    # now load your data by calling getdata()
    x <- getdata("faithful")

Or using an anonymous function:

    x <- (function(...) get(data(..., envir = new.env())))("faithful")
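One way to check that neither approach leaves a second copy behind is to look for the name in the global environment only. This is a small sketch assuming a fresh session; inherits = FALSE matters because faithful is always reachable through the datasets package on the search path.

```r
# Helper from above: data() loads into a throwaway environment,
# and only the object itself is returned.
getdata <- function(...) {
  e <- new.env()
  name <- data(..., envir = e)[1]   # data() returns the dataset name
  e[[name]]
}

x <- getdata("faithful")

# Was a variable called "faithful" created in the global environment?
# Expected to be FALSE in a fresh session.
dup <- exists("faithful", envir = globalenv(), inherits = FALSE)
dup

is.data.frame(x)
```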

Lazy evaluation

You should also consider lazy loading your data by adding LazyData: true to the DESCRIPTION file of your package.
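For a package that lazy-loads its data (the datasets package, which ships faithful, is one), the dataset can also be referenced with the :: operator, so no data() call is needed at all. A minimal sketch, assuming a fresh session:

```r
# Reference the lazy-loaded dataset directly from its package namespace
x <- datasets::faithful

dim(x)

# No variable called "faithful" is created in the global environment
exists("faithful", envir = globalenv(), inherits = FALSE)
```

The same pattern should apply to your own package once LazyData is enabled, with the package name in place of datasets.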

If you use RStudio, after running data("faithful") you'll see in the Environment panel that the "faithful" data.frame is called a "promise" (another, less common name is "thunk") and is greyed out. That means it is lazily evaluated by R and has not yet been loaded into memory. You can also lazy load the "x" variable with the delayedAssign() function:

    data("faithful")              # lazy load "faithful"
    delayedAssign("x", faithful)  # lazy assign "x" with a reference to "faithful"
    rm(faithful)                  # remove "faithful"

Still, nothing has been loaded into memory yet.

    summary(x)                    # now x has been loaded and evaluated

Learn more about lazy evaluation here.
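To see the promise being forced without relying on the RStudio panel, a flag can be set inside the delayed expression. This is only an illustrative sketch; the variable names evaluated, before, and after are made up for the example.

```r
evaluated <- FALSE

# The braced expression is stored as a promise and is not run yet.
delayedAssign("x", {
  evaluated <- TRUE        # side effect that marks the moment of evaluation
  datasets::faithful
})

before <- evaluated        # the promise has not been touched so far
n <- nrow(x)               # first use of x forces the promise
after <- evaluated         # the flag was flipped during evaluation

c(before = before, after = after)
```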

