--- title: "R mapping" author: "John Mount" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{R mapping} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- `rquery` re-maps a number of symbols during `SQL` translation. During expression parsing the internal `rquery` function `tokenize_call_for_SQL()` implements the following re-mappings from `R` idioms to `SQL` notation. ```{r} library("rquery") library("wrapr") show_translation <- function(strings) { vapply(strings, function(si) { format(rquery::tokenize_for_SQL(parse(text = si, keep.source = FALSE)[[1]], colnames = NULL)$parsed_toks) }, character(1)) } mapping_table <- data.frame( example = c('!x', 'is.na(x)', 'ifelse(a, b, c)', 'a^b', 'a%%b', 'a==b', 'a&&b', 'a&b', 'a||b', 'a|b', 'pmin(a, b)', 'pmax(a, b)'), stringsAsFactors = FALSE) mapping_table$translation <- show_translation(mapping_table$example) knitr::kable(mapping_table) ``` Note: not all possible mappings are implemented. For example we currently do not re-map `%in%`, preferring the user to explicitly work with [`set_indicator()`](https://winvector.github.io/rquery/reference/set_indicator.html) directly. In addition to this the database connectors can specify additional re-mappings. This can be found by building a formal connector and inspecting the re-mappings. ```{r} have_RSQLite <- requireNamespace("RSQLite", quietly = TRUE) ``` ```{r, eval=have_RSQLite} raw_RSQLite_connection <- DBI::dbConnect(RSQLite::SQLite(), ":memory:") RSQLite::initExtension(raw_RSQLite_connection) db <- rquery_db_info( connection = raw_RSQLite_connection, is_dbi = TRUE, connection_options = rq_connection_tests(raw_RSQLite_connection)) fn_name_map <- db$connection_options[[paste0("rquery.", rq_connection_name(db), ".", "fn_name_map")]] fn_name_map ``` We see above that "`mean`" is re-mapped to "`avg`". In all cases we can see what re-mappings happen by examining a query. ```{r, eval=have_RSQLite} d_local <- build_frame( "subjectID", "surveyCategory" , "assessmentTotal", "irrelevantCol1", "irrelevantCol2" | 1L , "withdrawal behavior", 5 , "irrel1" , "irrel2" | 1L , "positive re-framing", 2 , "irrel1" , "irrel2" | 3L , "withdrawal behavior", 3 , "irrel1" , "irrel2" | 3L , "positive re-framing", 2 , "irrel1" , "irrel2" | 3L , "other" , 1 , "irrel1" , "irrel2" ) table_handle <- rq_copy_to(db, 'd', d_local, temporary = TRUE, overwrite = TRUE) print(table_handle) ops <- table_handle %.>% project(., avg_total := avg(pmax(0, assessmentTotal)), groupby = "subjectID") cat(to_sql(ops, db)) ops %.>% execute(db, .) %.>% knitr::kable(.) ``` The basic mappings are stored in database option structures, and depend on the database. For example `MOD` is re-mapped back to `%` for `RSQLite`. ```{r, eval=have_RSQLite} rquery::rq_function_mappings(db) %.>% knitr::kable(.) ``` ```{r, eval=have_RSQLite} ops <- table_handle %.>% project(., groupby = "subjectID", n := 5, count := n(), mean := mean(assessmentTotal)) %.>% extend(., was_n := n) cat(to_sql(ops, db)) ops %.>% execute(db, .) %.>% knitr::kable(.) ``` Additional function re-mappings can be specified by user code. One such example is re-writing `MOD` as `%` for `RSQLite`. ```{r, eval=have_RSQLite} ops <- table_handle %.>% extend(., z := 1 + subjectID %% 3) %.>% select_columns(., c("subjectID", "z")) cat(to_sql(ops, db)) ops %.>% execute(db, .) %.>% knitr::kable(.) ``` `rqdatatable` also supplies some re-mappings (described [here](https://winvector.github.io/rqdatatable/articles/R_mapping.html)). This can allow us to use a uniform notation for tasks such as random number generation to allow portable pipelines. ```{r, eval=have_RSQLite} DBI::dbDisconnect(raw_RSQLite_connection) ```