
Vector Utility Functions and String Vectors

R ships with a rich set of built-in utility functions for working with vectors. Statistical summaries such as mean(), median(), sd(), var(), min(), max(), range(), and quantile() all operate on numeric vectors. Sorting is handled by sort(), which returns the sorted values, and order(), which returns the indices that would sort the vector; order() is especially useful for sorting a data frame by a column. The which() function returns the indices of the TRUE elements in a logical vector, making it easy to find matching positions.

String vectors are character vectors, and base R provides a set of functions for manipulating them: nchar() for the number of characters in each element, toupper() and tolower() for case conversion, substr() for substrings, paste() and paste0() for concatenation, sub() and gsub() for replacing the first match or all matches of a pattern (regular expressions are supported), and grepl() for testing each element against a pattern and returning TRUE or FALSE. The stringr package from the tidyverse offers a more consistent API for the same operations.

Type-testing and conversion functions are indispensable when writing robust code. The is.* family (is.numeric(), is.character(), is.logical(), is.vector()) lets you check types defensively, while is.na() tests each element for missing values rather than for a type. The matching as.* functions, such as as.numeric() and as.character(), perform the conversions. Combining these with which() and logical subsetting lets you filter and transform vector elements safely and expressively.
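As a minimal sketch of the defensive pattern described above, the snippet below checks a vector's type, converts it, and then uses is.na(), which(), and logical subsetting to separate usable values from failed conversions. The raw vector and the variable names are hypothetical and chosen only for illustration.

```r
# Hypothetical input: values read as text, some of them not numeric
raw <- c("12", "7.5", "n/a", "3", "")

is.character(raw)   # TRUE
is.numeric(raw)     # FALSE

# as.numeric() yields NA for strings it cannot parse (and warns about them),
# so the warning is suppressed here and the NAs are handled explicitly
nums <- suppressWarnings(as.numeric(raw))

bad   <- which(is.na(nums))    # positions that failed to convert
clean <- nums[!is.na(nums)]    # logical subsetting keeps the valid values

bad           # 3 5
clean         # 12.0 7.5 3.0
mean(clean)   # 7.5
```

Handling the NA values explicitly, rather than passing na.rm = TRUE everywhere, keeps a record of which positions were dropped.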
Example
# Numeric utilities
v <- c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5)

mean(v)           # 4
median(v)         # 4
sd(v)             # 2.366...
var(v)            # 5.6
range(v)          # 1 9
quantile(v, 0.75) # 5  (75th percentile)
cumsum(v)         # running total
diff(v)           # lag-1 differences

# Sorting and ordering
sort(v)                      # ascending
sort(v, decreasing = TRUE)   # descending
order(v)                     # indices that sort v ascending
v[order(v)]                  # same as sort(v)

# which() — find matching indices
which(v > 4)                 # 5 6 8 9 11  (positions where v > 4)
which.min(v)                 # 2  (position of the first minimum)
which.max(v)                 # 6  (position of the first maximum)

# Set-like operations on vectors
a <- c(1, 2, 3, 4, 5)
b <- c(3, 4, 5, 6, 7)
union(a, b)       # 1 2 3 4 5 6 7
intersect(a, b)   # 3 4 5
setdiff(a, b)     # 1 2   (in a but not b)
4 %in% a          # TRUE

# String vectors
fruits <- c("Apple", "banana", "Cherry")
nchar(fruits)                         # 5 6 6
tolower(fruits)                       # all lowercase
toupper(fruits)                       # all uppercase
substr(fruits, 1, 3)                  # "App" "ban" "Che"
paste(fruits, "fruit", sep = "-")     # "Apple-fruit" ...
paste0("item_", 1:3)                  # "item_1" "item_2" "item_3"
gsub("a", "@", fruits, ignore.case = TRUE)  # "@pple" "b@n@n@" "Cherry"
grepl("^[A-Z]", fruits)               # TRUE FALSE TRUE
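
The text notes that order() is especially useful for sorting a data frame by a column, which the example above does not show. The following is a small sketch of that idea; the scores data frame and its column names are hypothetical and exist only for illustration.

```r
# Hypothetical data frame used only for illustration
scores <- data.frame(
  name  = c("Ana", "Ben", "Cy", "Dee"),
  score = c(82, 95, 82, 70),
  stringsAsFactors = FALSE
)

# order() returns row indices; using them in the row position
# reorders every column together
by_score <- scores[order(scores$score), ]

# Descending by score, with ties broken alphabetically by name
ranked <- scores[order(-scores$score, scores$name), ]

by_score$name   # "Dee" "Ana" "Cy" "Ben"
ranked$name     # "Ben" "Ana" "Cy" "Dee"
```

sort() alone would return only the sorted score values, whereas indexing with order() keeps each name attached to its score.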