---
title: "Query Generation"
author: "John Mount"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Query Generation}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
run_vignette <- requireNamespace("DBI", quietly = TRUE) && requireNamespace("RSQLite", quietly = TRUE)
```

This is a very brief place-holder example (not executed, as `SQLite` does not have the needed window functions).
For more details please see the fuller note: [`rquery README`](https://winvector.github.io/rquery/index.html).

The primary purpose of `rquery` is `SQL` query generation.  We demonstrate this below.

```{r ex, warning=FALSE, message=FALSE, include=FALSE, eval=run_vignette}
library("rquery")
library("wrapr")

# this db does not have window fns
my_db <- DBI::dbConnect(RSQLite::SQLite(), 
                        ":memory:")

dbopts <- rq_connection_tests(my_db)
print(dbopts)
options(dbopts)

# copy in example data
d_local <- build_frame(
   "subjectID", "surveyCategory"     , "assessmentTotal", "irrelevantCol1", "irrelevantCol2" |
   1          , "withdrawal behavior", 5                , "irrel1"        , "irrel2"         |
   1          , "positive re-framing", 2                , "irrel1"        , "irrel2"         |
   2          , "withdrawal behavior", 3                , "irrel1"        , "irrel2"         |
   2          , "positive re-framing", 4                , "irrel1"        , "irrel2"         )
rq_copy_to(my_db, 'd',
            d_local,
            temporary = TRUE, 
            overwrite = TRUE)
```

Note: in examples we use `rq_copy_to()` to create data.  This is only for the purpose of having
easy portable examples.  With big data the data is usually already in the remote database or
Spark system. The task is almost always to connect and work with this pre-existing remote data
and the method to do this is [`db_td()`](https://winvector.github.io/rquery/reference/db_td.html),
which builds a reference to a remote table given the table name.

```{r calc, eval=run_vignette}
# produce a hande to existing table
d <- db_td(my_db, "d")

scale <- 0.237

dq <- d %.>%
  extend(.,
         one = 1) %.>%
  extend(.,
         probability :=
           exp(assessmentTotal * scale)/
           sum(exp(assessmentTotal * scale)),
         count := sum(one),
         partitionby = 'subjectID') %.>%
  extend(.,
         rank := cumsum(one),
         partitionby = 'subjectID',
         orderby = c('probability', 'surveyCategory'))  %.>%
  rename_columns(., 'diagnosis' := 'surveyCategory') %.>%
  select_rows(., rank == count) %.>%
  select_columns(., c('subjectID', 
                      'diagnosis', 
                      'probability')) %.>%
  orderby(., 'subjectID')

class(my_db)
```

Presentation format (see also [`op_diagram()`](https://winvector.github.io/rquery/reference/op_diagram.html)):

```{r pf, echo=FALSE, comment = ' ', eval=run_vignette}
cat(format(dq))
```

`to_sql()` SQL (see also [`materialize()`](https://winvector.github.io/rquery/reference/materialize.html)):

```{r res, echo=FALSE, comment = ' ', eval=run_vignette}
sql <- to_sql(dq, db = my_db, source_limit = 1000)

cat(sql)
```


```{r cleanup, include=FALSE, eval=run_vignette}
DBI::dbDisconnect(my_db)
```