
Transforms a vector of any type, or a combination of several vectors, into an integer vector ranging from 1 to the number of unique values. In effect, this creates a unique identifier vector.

Usage

to_integer(
  ...,
  sorted = FALSE,
  add_items = FALSE,
  items.list = FALSE,
  multi.df = FALSE,
  multi.join = "_",
  internal = FALSE
)

Arguments

...

Vectors of any type, to be transformed into integers.

sorted

Logical, default is FALSE. Whether the integers should refer to the sorted unique values, so that identifier 1 corresponds to the smallest value.

add_items

Logical, default is FALSE. Whether to add the unique values of the original vector(s). If requested, an attribute items is created containing the values (alternatively, they can appear in a list if items.list=TRUE).

items.list

Logical, default is FALSE. Only used if add_items=TRUE. If TRUE, then a list of length 2 is returned with x the integer vector and items the vector of items.

multi.df

Logical, default is FALSE. If TRUE, the unique elements are returned as a data.frame. Ignored if add_items = FALSE.

multi.join

Character scalar used to join the items of multiple vectors. The default is "_". Ignored if add_items = FALSE.

internal

Logical, default is FALSE. For programming only. If this function is used within another function, setting internal = TRUE is needed to make the evaluation of ... valid. End users of to_integer can ignore this argument.
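To make the effect of the sorted argument concrete, here is a minimal base R sketch of the two numbering schemes. It is an illustration only and not how to_integer is implemented: the variable names are made up for this example, and the unsorted case assumes items are kept in order of first appearance, which to_integer does not guarantee.

```r
v <- c(5, 4, 5, 7, 6, 4)

# Unsorted numbering (assumption: items kept in order of first appearance).
items_unsorted <- unique(v)
id_unsorted <- match(v, items_unsorted)

# Sorted numbering: identifier i refers to the i-th smallest unique value.
items_sorted <- sort(unique(v))
id_sorted <- match(v, items_sorted)

id_unsorted  # 1 2 1 3 4 2
id_sorted    # 2 1 2 4 3 1
```

In both cases the identifiers run from 1 to the number of unique values; only the mapping between identifiers and items changes.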

Value

Returns an integer vector of the same length as the input vectors. If add_items=TRUE and items.list=TRUE, a list of two elements is returned: x, the integer vector, and items, the unique values to which the values in x refer.
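The two return shapes described above can be mimicked with a small base R stand-in. The function to_integer_sketch below is hypothetical and written only for this illustration; it always sorts the items and handles a single vector, so it is not a substitute for to_integer.

```r
# Hypothetical base R stand-in; NOT the fixest implementation.
to_integer_sketch <- function(v, items.list = FALSE) {
  items <- sort(unique(v))  # the unique values, sorted
  x <- match(v, items)      # position of each value among the unique values
  if (items.list) {
    return(list(x = x, items = items))
  }
  attr(x, "items") <- items # default form: items stored as an attribute
  x
}

v <- c("b", "a", "c", "a", "b")

res <- to_integer_sketch(v, items.list = TRUE)
res$x      # 2 1 3 1 2
res$items  # "a" "b" "c"

# Indexing the items by the identifiers recovers the original values.
all(res$items[res$x] == v)
```

The list form is convenient when the items are needed right away, while the attribute form keeps the result usable as a plain integer vector.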

Author

Laurent Berge

Examples


x1 = iris$Species
x2 = as.integer(iris$Sepal.Length)

# transforms the species vector into integers
to_integer(x1)
#>   [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#>  [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
#>  [75] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2
#> [112] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
#> [149] 2 2

# To obtain the "items":
to_integer(x1, add_items = TRUE)
#>   [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#>  [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
#>  [75] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2
#> [112] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
#> [149] 2 2
#> attr(,"items")
#> [1] "setosa"     "virginica"  "versicolor"
# same but in list form
to_integer(x1, add_items = TRUE, items.list = TRUE)
#> $x
#>   [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#>  [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
#>  [75] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2
#> [112] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
#> [149] 2 2
#> 
#> $items
#> [1] "setosa"     "virginica"  "versicolor"
#> 

# transforms x2 into an integer vector from 1 to 4
to_integer(x2, add_items = TRUE)
#>   [1] 1 2 2 2 1 1 2 1 2 2 1 2 2 2 1 1 1 1 1 1 1 1 2 1 2 1 1 1 1 2 2 1 1 1 2 1 1
#>  [38] 2 2 1 1 2 2 1 1 2 1 2 1 1 3 4 4 1 4 1 4 2 4 1 1 1 4 4 1 4 1 1 4 1 1 4 4 4
#>  [75] 4 4 4 4 4 1 1 1 1 4 1 4 4 4 1 1 1 4 1 1 1 1 1 4 1 1 4 1 3 4 4 3 2 3 4 3 4
#> [112] 4 4 1 1 4 4 3 3 4 4 1 3 4 4 3 4 4 4 3 3 3 4 4 4 3 4 4 4 4 4 4 1 4 4 4 4 4
#> [149] 4 1
#> attr(,"items")
#> [1] 5 4 7 6

# To have the sorted items:
to_integer(x2, add_items = TRUE, sorted = TRUE)
#>   [1] 2 1 1 1 2 2 1 2 1 1 2 1 1 1 2 2 2 2 2 2 2 2 1 2 1 2 2 2 2 1 1 2 2 2 1 2 2
#>  [38] 1 1 2 2 1 1 2 2 1 2 1 2 2 4 3 3 2 3 2 3 1 3 2 2 2 3 3 2 3 2 2 3 2 2 3 3 3
#>  [75] 3 3 3 3 3 2 2 2 2 3 2 3 3 3 2 2 2 3 2 2 2 2 2 3 2 2 3 2 4 3 3 4 1 4 3 4 3
#> [112] 3 3 2 2 3 3 4 4 3 3 2 4 3 3 4 3 3 3 4 4 4 3 3 3 4 3 3 3 3 3 3 2 3 3 3 3 3
#> [149] 3 2
#> attr(,"items")
#> [1] 4 5 6 7

# The result can safely be used as an index
res = to_integer(x2, add_items = TRUE, sorted = TRUE, items.list = TRUE)
all(res$items[res$x] == x2)
#> [1] TRUE


#
# Multiple vectors
#

to_integer(x1, x2, add_items = TRUE)
#>   [1]  2  1  1  1  2  2  1  2  1  1  2  1  1  1  2  2  2  2  2  2  2  2  1  2  1
#>  [26]  2  2  2  2  1  1  2  2  2  1  2  2  1  1  2  2  1  1  2  2  1  2  1  2  2
#>  [51]  6  5  5  4  5  4  5  3  5  4  4  4  5  5  4  5  4  4  5  4  4  5  5  5  5
#>  [76]  5  5  5  5  4  4  4  4  5  4  5  5  5  4  4  4  5  4  4  4  4  4  5  4  4
#> [101]  9  8 10  9  9 10  7 10  9 10  9  9  9  8  8  9  9 10 10  9  9  8 10  9  9
#> [126] 10  9  9  9 10 10 10  9  9  9 10  9  9  9  9  9  9  8  9  9  9  9  9  9  8
#> attr(,"items")
#>  [1] "setosa_4"     "setosa_5"     "versicolor_4" "versicolor_5" "versicolor_6"
#>  [6] "versicolor_7" "virginica_4"  "virginica_5"  "virginica_6"  "virginica_7" 

# Use multi.join to set the separator placed between the joined items:
to_integer(x1, x2, add_items = TRUE, multi.join = "; ")
#>   [1]  2  1  1  1  2  2  1  2  1  1  2  1  1  1  2  2  2  2  2  2  2  2  1  2  1
#>  [26]  2  2  2  2  1  1  2  2  2  1  2  2  1  1  2  2  1  1  2  2  1  2  1  2  2
#>  [51]  6  5  5  4  5  4  5  3  5  4  4  4  5  5  4  5  4  4  5  4  4  5  5  5  5
#>  [76]  5  5  5  5  4  4  4  4  5  4  5  5  5  4  4  4  5  4  4  4  4  4  5  4  4
#> [101]  9  8 10  9  9 10  7 10  9 10  9  9  9  8  8  9  9 10 10  9  9  8 10  9  9
#> [126] 10  9  9  9 10 10 10  9  9  9 10  9  9  9  9  9  9  8  9  9  9  9  9  9  8
#> attr(,"items")
#>  [1] "setosa; 4"     "setosa; 5"     "versicolor; 4" "versicolor; 5"
#>  [5] "versicolor; 6" "versicolor; 7" "virginica; 4"  "virginica; 5" 
#>  [9] "virginica; 6"  "virginica; 7"
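For intuition, the multiple-vector case can be approximated in base R by pasting the vectors together and numbering the distinct combinations. This is only a sketch of the idea under that assumption, not the method to_integer uses internally; the names key, items and id are chosen for this example.

```r
x1 <- iris$Species
x2 <- as.integer(iris$Sepal.Length)

# One string per observation, e.g. "setosa_5"; sep plays the role of multi.join.
key <- paste(x1, x2, sep = "_")

items <- sort(unique(key))  # the distinct combinations
id <- match(key, items)     # integer identifier from 1 to length(items)

length(items)          # 10
all(items[id] == key)  # TRUE
```

One caveat of this pasting approach: if the values themselves contain the separator, two different combinations can collapse into the same string, so the separator should be chosen with the data in mind.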