cash-.hdd.Rd
This method extracts a single variable from a hard drive data set (HDD). An automatic protection prevents extracting data that is too large to fit in memory. The bound is set with the function setHdd_extract.cap.
# S3 method for hdd
$(x, name)
A HDD object.
The variable name to be extracted. Note that there is an automatic protection against importing data that would not fit in memory. The extraction cap is set with the function setHdd_extract.cap.
It returns a vector.
By default, if the expected size of the variable to extract is greater than the value given by getHdd_extract.cap, an error is raised.
For numeric variables, the expected size is exact. For non-numeric data, the expected size is a guess that assumes all non-numeric variables have the same size. This may lead to over- or underestimation, depending on the case.
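The size check for a numeric variable can be sketched in base R as follows. This is only an illustration, not the package's actual code: the helper expected_size_mb is hypothetical, and it assumes 8 bytes per numeric value and 1 MB = 1e6 bytes, neither of which is confirmed by this page.

```r
# Hypothetical helper (not part of the hdd package): in-memory size, in MB,
# of a numeric variable, assuming 8 bytes per value (R doubles)
# and 1 MB = 1e6 bytes.
expected_size_mb = function(n_rows, bytes_per_value = 8) {
  n_rows * bytes_per_value / 1e6
}

n_rows = 1650    # e.g. 11 stacked copies of iris (150 rows each)
cap_mb = 0.005   # a deliberately tiny cap of 5KB

size_mb = expected_size_mb(n_rows)

# TRUE means the extraction would be stopped by the protection
exceeds_cap = size_mb > cap_mb

# For comparison: size, in bytes, of an actual numeric vector of that length
object.size(numeric(n_rows))
```

Under these assumptions, a single numeric column of 1650 rows already exceeds a 5KB cap, while it stays far below a cap of several GB.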
In any case, if your variable is large and you don't want to change the extraction cap (setHdd_extract.cap), you can still extract the variable with sub-.hdd, for which there is no such protection.
Note that you cannot create variables with $, e.g. base_hdd$x_new <- something. To create variables, use [ instead (see sub-.hdd).
See hdd, sub-.hdd and cash-.hdd for the extraction and manipulation of out-of-memory data. To import HDD data sets from text files, see txt2hdd.
See hdd_slice to apply functions to chunks of data (and create HDD objects) and hdd_merge to merge large files.
To create/reshape HDD objects from memory or from other HDD objects, see write_hdd.
To display general information from HDD objects: origin, summary.hdd, print.hdd, dim.hdd and names.hdd.
# Toy example with iris data
# We first create a hdd dataset with approx. 100KB
hdd_path = tempfile() # => folder where the data will be saved
write_hdd(iris, hdd_path)
for(i in 1:10) write_hdd(iris, hdd_path, add = TRUE)
base_hdd = hdd(hdd_path)
summary(base_hdd) # => 11 files
#> Hard drive data of 48.7 KB. Made of 11 files.
#> Location: C:/Users/lrberge/AppData/Local/Temp/Rtmpa0wfuK/file568873707213/
#> 1650 lines, 5 variables.
# we can extract the data from the 11 files with '$':
pl = base_hdd$Sepal.Length
#
# Illustration of the protection mechanism:
#
# By default, when extracting a variable with '$',
# if its size exceeds the cap (3GB by default),
# a confirmation is needed.
# You can set the cap with setHdd_extract.cap.
# The following asks for confirmation in interactive mode:
setHdd_extract.cap(sizeMB = 0.005) # new cap of 5KB
pl = base_hdd$Sepal.Length
# To extract the variable without changing the cap:
pl = base_hdd[, Sepal.Length] # => no size control is performed
# Resetting the default cap
setHdd_extract.cap()