### Data Frames as Vectors of Rows

Recently, I have been working a lot on {vctrs}. This package is an attempt to analyze the atomic types in R, such as `integer`

, `character`

and `double`

, alongside the recursive types of `list`

and `data.frame`

, to extract a set of common principles. From this analysis, a growing toolkit of functions for working with vector types has developed around two themes of *size* and *prototype*. vctrs is a *fun* package to work on, and even more fun to build on top of.

The goal of this post is two fold. First, I want to show off a few functions and packages that have been built using this new toolkit which contribute to why I think this package is so fun. Second, I’d like to introduce a shift in the way you might normally think about data frames, *from a vector of columns to a vector of rows*.
Hadley is the one who recognized that viewing data frames from this angle opens up powerful new workflows that are especially useful for data analysis.

If you’ve never heard of vctrs before, there’s a reason for that. For the most part, it’s a developer focused package, and honestly if you never knew this package existed, but still used the higher level packages that were built on top of it, then we’ve done our job. A few examples of packages that rely heavily on vctrs right now are:

- tidyr’s
`pivot_longer()`

and`pivot_wider()`

, along with a number of other existing functions that recently got rewritten using vctrs principles - slide, for working with window functions (like moving averages)
- rray, an array manipulation library
- tibble’s development version

```
library(vctrs)
library(tibble)
```

`c()`

As you gain more experience working with R, you eventually start to learn about how certain data structures are implemented. A data frame, for example, is really a list where each element of the list is a single column. In other words, a data frame is a vector of columns.

One way to see this is by the fact that `length()`

returns the number of columns.

```
df <- tibble(
x = 1:4,
y = c("a", "b", "a", "a"),
z = c("x", "x", "y", "x")
)
df
```

```
## # A tibble: 4 x 3
## x y z
## <int> <chr> <chr>
## 1 1 a x
## 2 2 b x
## 3 3 a y
## 4 4 a x
```

`length(df)`

`## [1] 3`

You can also check that a data frame is a list by calling `is.list()`

.

`is.list(df)`

`## [1] TRUE`

This underlying assumption that a data frame is a vector of columns is deeply rooted into a number of R’s core functions, and is often used as a fallback when behavior is otherwise ill-defined. Consider, as an example, calling `c(df, df)`

to “combine” the data frame with itself.

`c(df, df)`

```
## $x
## [1] 1 2 3 4
##
## $y
## [1] "a" "b" "a" "a"
##
## $z
## [1] "x" "x" "y" "x"
##
## $x
## [1] 1 2 3 4
##
## $y
## [1] "a" "b" "a" "a"
##
## $z
## [1] "x" "x" "y" "x"
```

Our data frames have combined to become a list! In a way, this behavior is consistent with the principle that a data frame is a list of columns. It follows the invariant (read: “unbreakable principle”) of:

`length(c(x, y)) == length(x) + length(y)`

`length(df) + length(df)`

`## [1] 6`

`length(c(df, df)) == length(df) + length(df)`

`## [1] TRUE`

Is there any other type of output that makes sense? If we think of a data frame as a vector of columns, then no, because it makes sense to end up with something of length 6 after combining. However, I’d argue that if we flip our understanding of data frames from a vector of columns to a vector of rows, then another solution comes forward which offers a different result that I have begun to find pretty attractive.

`vec_size()`

Much of this particular section is adapted from the vctrs vignette on size.

To start the process of thinking about a “vector of rows”, look again to `df`

. A single “row” would be:

`df[1,]`

```
## # A tibble: 1 x 3
## x y z
## <int> <chr> <chr>
## 1 1 a x
```

With that in mind, `df`

would be considered a vector of 4 rows. It would be nice to have a function that returned this information as a building block to work off of (as shown before, `length()`

won’t do). We could try and use `nrow()`

, which gives us what we want, but it returns `NULL`

when given an individual column.

`nrow(df)`

`## [1] 4`

`nrow(df$x)`

`## NULL`

What I’m looking for is a function that returns a value that is equivalent for the data frame itself and for any column of the data frame. Put another way, I’m really after the “number of observations”. One other option is to use `NROW()`

.

`NROW(df)`

`## [1] 4`

`NROW(df$x)`

`## [1] 4`

This looks good, but what happens if you give it some “non-vector” input? A practical way to think about that is something that isn’t allowed to exist as a column of a data frame, for example, a function, or an `lm`

object.

`data.frame(x = mean)`

`## Error in as.data.frame.default(x[[i]], optional = TRUE): cannot coerce class '"function"' to a data.frame`

`NROW(mean)`

`## [1] 1`

```
lm_cars <- lm(mpg ~ cyl, data = mtcars)
data.frame(x = lm_cars)
```

`## Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors): cannot coerce class '"lm"' to a data.frame`

```
# Treats it as a list
NROW(lm_cars)
```

`## [1] 12`

These objects are considered *scalar* types rather than *vector* types. They are “scalar” in the sense that you only ever consider them one at a time. Even though `lm_cars`

is *technically* implemented as a list, we look at it is as a single linear model object.

Compare that to a double *vector* like `c(1, 2, 3)`

which is made up of 3 observations.

For our purposes, it’s valuable to keep scalar and vector types distinct, so it would be nice if an error was thrown for scalars to indicate that they don’t really have this “number of observations” property that we are after.

Motivated by this, the concept of `size`

was created in vctrs to capture the invariants that were desired. In particular:

- It is the length of 1d vectors.
- It is the number of rows of data frames, matrices, and arrays.
- It throws error for non vectors.

The vctrs function, `vec_size()`

, is the resulting implementation of this concept.

`vec_size(df)`

`## [1] 4`

`vec_size(df$x)`

`## [1] 4`

`vec_size(mean)`

`## `x` must be a vector, not a function`

`vec_c()`

Armed with the concept of size, let’s take another look at `c(df, df)`

. If the length invariant for `c()`

looks like:

`length(c(x, y)) == length(x) + length(y)`

then imagine what would happen if `length()`

was swapped with `vec_size()`

:

`vec_size(c(x, y)) =?= vec_size(x) + vec_size(y)`

Does this invariant hold? For 1d vectors, it does because `vec_size()`

and `length()`

are essentially the same. But for data frames, we’ve seen that `vec_size()`

and `length()`

are different, and `c()`

was built with `length()`

in mind, so it might not. In fact, looking at our original example of `c(df, df)`

proves that it doesn’t hold:

`vec_size(df) + vec_size(df)`

`## [1] 8`

`vec_size(c(df, df))`

`## [1] 6`

We’d like this invariant to hold, but `c()`

isn’t the right tool for the job. Instead, an alternative, `vec_c()`

, was built that builds off of `vec_size()`

and this invariant. The invariant that does hold actually looks like:

`vec_size(vec_c(x, y)) == vec_size(x) + vec_size(y)`

So what does `vec_c()`

do? For 1d vectors it acts like `c()`

, as you might expect.

`vec_c(1:2, 3)`

`## [1] 1 2 3`

But what about with `vec_c(df, df)`

? Based on the fact that `vec_size(df) + vec_size(df) = 8`

, at the very least we know it should return something with a size of 8. To accomplish this, rather than coercing to a list and combining the columns together, `vec_c()`

instead leaves them as data frames and *combines the rows*.

`vec_c(df, df)`

```
## # A tibble: 8 x 3
## x y z
## <int> <chr> <chr>
## 1 1 a x
## 2 2 b x
## 3 3 a y
## 4 4 a x
## 5 1 a x
## 6 2 b x
## 7 3 a y
## 8 4 a x
```

If you view a data frame as a vector of rows, this makes complete sense. We start with a vector of 4 rows, and add another vector of 4 rows, so we should end up with a vector of 8 rows, i.e. a data frame with size 8.

What happens if we combine `df`

with a data frame containing one row but an entirely new column? Again, we “know” from our size invariant that we should get something back with a size of 5.

```
df_w <- tibble(w = 1)
vec_c(df, df_w)
```

```
## # A tibble: 5 x 4
## x y z w
## <int> <chr> <chr> <dbl>
## 1 1 a x NA
## 2 2 b x NA
## 3 3 a y NA
## 4 4 a x NA
## 5 NA <NA> <NA> 1
```

The result is a data frame with 5 rows, and a union of the columns coming from each of the two individual data frames. Without getting too much into it, the fact that the result is a “tibble with 4 columns: x (int), y (chr), z (chr), and w (dbl)” comes from the other half of what vctrs offers, the *prototype*. `vec_c()`

found the “common type” between `df`

and `df_w`

, which is the union holding those 4 columns.

What about combining `df`

with something like the double vector, `c(1, 2)`

? We’d expect a size of 6 (4 from `df`

and 2 from the vector), but we actually get an error because the size is only half of the story.

`vec_c(df, c(1, 2))`

```
## No common type for `x` <tbl_df<
## x: integer
## y: character
## z: character
## >> and `y` <double>.
```

In this case, there is no common type between a data frame and a double vector, so you can’t combine them together.

Compare that with the result from `c()`

which upholds its length invariant giving a result of length 5.

`c(df, c(1, 2))`

```
## $x
## [1] 1 2 3 4
##
## $y
## [1] "a" "b" "a" "a"
##
## $z
## [1] "x" "x" "y" "x"
##
## [[4]]
## [1] 1
##
## [[5]]
## [1] 2
```

`vec_match()`

Treatment of a data frame as a vector of rows extends well past `vec_c()`

, and bleeds into many other vctrs tools where we have been experimenting with this idea. As one more example, we’ll take a look at `match()`

. If you aren’t familiar with `match()`

, it tells you the location of `x`

inside a `table`

. Another way to think about this is that you want to find a `needle`

in a `haystack`

. More concretely:

```
# Where is `"a"` inside the vector `c("b", "c", "a", "d")`?
match(x = "a", table = c("b", "c", "a", "d"))
```

`## [1] 3`

Now imagine that I want to use a data frame as my `table`

. I might be interested in locating a few particular rows inside that `table`

.

```
needles <- tibble(x = 3:4, y = c("a", "b"), z = c("y", "y"))
haystack <- df
needles
```

```
## # A tibble: 2 x 3
## x y z
## <int> <chr> <chr>
## 1 3 a y
## 2 4 b y
```

`haystack`

```
## # A tibble: 4 x 3
## x y z
## <int> <chr> <chr>
## 1 1 a x
## 2 2 b x
## 3 3 a y
## 4 4 a x
```

Here, the first row of `needles`

is row 3 of the `haystack`

, and the second row of `needles`

is not in the `haystack`

at all. Let’s try with `match()`

.

`match(needles, haystack)`

`## [1] NA NA NA`

🤔 so what happened here? With R’s (completely reasonable) treatment of `needles`

as a vector of columns, it essentially first converted `needles`

and `haystack`

into lists, and then tried to locate each column of `needles`

inside `haystack`

. There were 3 columns, and none of them were found, so `NA`

was returned 3 times. To actually see a match, we could instead provide a list containing the `y`

column of `haystack`

.

`match(list(haystack$y), haystack)`

`## [1] 2`

If we instead treat both `needles`

and `haystack`

as vectors of rows, what we are really trying to do is find one set of rows inside another set of rows. In vctrs, we’ve created `vec_match()`

for this.

`vec_match(needles, haystack)`

`## [1] 3 NA`

Again, it’s not that anything R is doing is *wrong*, or even that this is “better”. This just answers a different question by looking at a data frame from a different angle. Additionally, we don’t actually lose anything in vctrs by thinking about data frames in this way. If we want the `match()`

behavior, we can just `unclass()`

our data frames to turn them into explicit lists, and then `vec_match()`

works exactly the same.
Even though `c()`

and `match()`

treat data frames as vectors of columns, not all R functions do. At the end of the post I discuss how `split()`

and `unique()`

actually treat them as vectors of rows, and what the vctrs equivalents are.

`slide()`

Lastly, I’d like to show an example of a package that builds on top of vctrs principles. {slide} is my attempt at a package for working with “window functions”, functions that enable some kind of “rolling” analysis. A moving average, rolling regression, and even a cumulative sum are all examples of usage of window functions.

```
# devtools::install_github("DavisVaughan/slide")
library(slide)
```

`slide()`

works similarly to `purrr::map()`

in that you provide it a vector, `.x`

, and a function, `.f`

, to apply to each slice of `.x`

. One difference is that you have additional options to control the window of `.x`

you apply `.f`

to. For example, below we construct a sliding window of size 3, asking for “the current value along with 2 values before this one”. The function we apply is to just print out the current value of `.x`

so we can see what is happening.

`slide(1:5, ~.x, .before = 2)`

```
## [[1]]
## [1] 1
##
## [[2]]
## [1] 1 2
##
## [[3]]
## [1] 1 2 3
##
## [[4]]
## [1] 2 3 4
##
## [[5]]
## [1] 3 4 5
```

We could also perform a rolling average by switching `~.x`

for `mean`

, and, like with purrr, replacing `slide()`

with `slide_dbl()`

.

`slide_dbl(1:5, mean, .before = 2)`

`## [1] 1.0 1.5 2.0 3.0 4.0`

Because `slide()`

builds on vctrs, it is meaningful to talk about the invariants of the function. For example, the size invariant of `slide()`

is that:

`vec_size(slide(.x)) == vec_size(.x)`

In other words, `slide()`

always returns an output that has the same *size* as its input. This is similar to how `map()`

works, with one major difference. Like `c()`

, `map()`

returns a vector with the same *length* as its input. This means that `map()`

treats a data frame as a vector of columns.

```
library(purrr)
map(df, ~.x)
```

```
## $x
## [1] 1 2 3 4
##
## $y
## [1] "a" "b" "a" "a"
##
## $z
## [1] "x" "x" "y" "x"
```

A major breakthrough for me was that, to uphold the invariant, `slide()`

must treat a data frame as a vector of rows, meaning that it should *iterate rowwise over .x*.

`slide(df, ~.x)`

```
## [[1]]
## # A tibble: 1 x 3
## x y z
## <int> <chr> <chr>
## 1 1 a x
##
## [[2]]
## # A tibble: 1 x 3
## x y z
## <int> <chr> <chr>
## 1 2 b x
##
## [[3]]
## # A tibble: 1 x 3
## x y z
## <int> <chr> <chr>
## 1 3 a y
##
## [[4]]
## # A tibble: 1 x 3
## x y z
## <int> <chr> <chr>
## 1 4 a x
```

This provides an alternative to some `pmap()`

solutions that have been used previously, like the ones in Jenny Bryan’s GitHub repo of row oriented workflows. Consider this example modified from the repo, where you have a data frame of parameters that you want to pass on to `runif()`

in order to call it multiple times with different parameter combinations. Additionally, the column names don’t currently match the argument names of `runif()`

, so you either have to rename on the fly, or wrap it with a function.

```
library(dplyr)
parameters <- tibble(
n = 1:3,
minimum = c(0, 10, 100),
maximum = c(1, 100, 1000)
)
set.seed(12)
parameters %>%
rename(min = minimum, max = maximum) %>%
pmap(runif)
```

```
## [[1]]
## [1] 0.06936092
##
## [[2]]
## [1] 83.59977 94.83596
##
## [[3]]
## [1] 342.4437 252.4133 130.5061
```

```
set.seed(12)
my_runif <- function(n, minimum, maximum) {
runif(n, minimum, maximum)
}
pmap(parameters, my_runif)
```

```
## [[1]]
## [1] 0.06936092
##
## [[2]]
## [1] 83.59977 94.83596
##
## [[3]]
## [1] 342.4437 252.4133 130.5061
```

With `slide()`

being a row wise iterator, you have access to the entire data frame row at each iteration as `.x`

, meaning you can just do:

```
set.seed(12)
slide(parameters, ~runif(n = .x$n, min = .x$minimum, max = .x$maximum))
```

```
## [[1]]
## [1] 0.06936092
##
## [[2]]
## [1] 83.59977 94.83596
##
## [[3]]
## [1] 342.4437 252.4133 130.5061
```

### Conclusion

Treatment of a data frame as a vector of rows is a fairly novel concept in R, because of the way that data frames were originally implemented as a list of columns. But viewing them in this way can be incredibly powerful, especially for data analysis work. I, for one, am looking forward to seeing this concept explored more in the future, both in vctrs and in other packages built on top of it.

### Extra - `unique()`

and `split()`

On the vctrs side, we are trying to be consistent in our treatment of data frames as vectors of rows. However, it is worth mentioning that there are some functions in R where data frames are already treated this way, rather than as a vector of columns. Two in particular are `unique()`

and `split()`

.

`unique()`

With `unique()`

, uniqueness is actually determined using data frame rows, not columns. Looking at columns `y`

and `z`

of `df`

, we can see that rows 1 and 4 are duplicates. Calling `unique()`

on this removes the duplicate row.

```
df_yz <- df[, c("y", "z")]
df_yz
```

```
## # A tibble: 4 x 2
## y z
## <chr> <chr>
## 1 a x
## 2 b x
## 3 a y
## 4 a x
```

`unique(df_yz)`

```
## # A tibble: 3 x 2
## y z
## <chr> <chr>
## 1 a x
## 2 b x
## 3 a y
```

It’s actually pretty interesting to see how this one works. If you look into `unique.data.frame()`

, you’ll see that it calls `duplicated.data.frame()`

. In there is this somewhat cryptic line that actually does the rowwise check:

`duplicated(do.call(Map, `names<-`(c(list, x), NULL)), fromLast = fromLast)`

Breaking this down, it first combines the function `list()`

with the data frame to end up with:

`c(list, df_yz)`

```
## [[1]]
## function (...) .Primitive("list")
##
## $y
## [1] "a" "b" "a" "a"
##
## $z
## [1] "x" "x" "y" "x"
```

which it then removes the names of:

``names<-`(c(list, df_yz), NULL)`

```
## [[1]]
## function (...) .Primitive("list")
##
## [[2]]
## [1] "a" "b" "a" "a"
##
## [[3]]
## [1] "x" "x" "y" "x"
```

Next it uses `do.call()`

to call `Map()`

, which is a wrapper around `mapply()`

meaning that it will repeatedly call `list()`

on parallel elements of the columns of `df_yz`

. Visually that means we end up with list elements holding the rows, which `duplicated()`

is then run on to locate the duplicates.

`do.call(Map, `names<-`(c(list, df_yz), NULL))`

```
## $a
## $a[[1]]
## [1] "a"
##
## $a[[2]]
## [1] "x"
##
##
## $b
## $b[[1]]
## [1] "b"
##
## $b[[2]]
## [1] "x"
##
##
## $a
## $a[[1]]
## [1] "a"
##
## $a[[2]]
## [1] "y"
##
##
## $a
## $a[[1]]
## [1] "a"
##
## $a[[2]]
## [1] "x"
```

`duplicated(do.call(Map, `names<-`(c(list, df_yz), NULL)))`

`## [1] FALSE FALSE FALSE TRUE`

In vctrs there is `vec_unique()`

. Because `unique()`

already works row wise, they are essentially equivalent in terms of functionality with data frames. However, there are two key differences. First, because `vec_unique()`

’s handling of data frames is in C, it does end up being faster.

```
# row bind df_yz 10000 times, making a 40000 row data frame
large_df <- vec_rbind(!!!rep_len(list(df_yz), 10000))
dim(large_df)
```

`## [1] 40000 2`

```
bench::mark(
unique(large_df),
vec_unique(large_df)
)
```

```
## # A tibble: 2 x 6
## expression min median `itr/sec` mem_alloc `gc/sec`
## <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
## 1 unique(large_df) 28.38ms 31.6ms 31.7 1.42MB 137.
## 2 vec_unique(large_df) 4.34ms 4.6ms 216. 429.61KB 0
```

Second, `unique()`

doesn’t handle the idea of a *packed* data frame well. This is a relatively new idea in the tidyverse, and it isn’t one that we want to expose users to very much yet, but it is powerful. A packed data frame is a data frame where one of the columns is another data frame. This is different from a list-column of data frames. You can create one by providing `tibble()`

another `tibble()`

as a column along with a name for that data frame column.

```
df_packed <- tibble(
x = tibble(
a = c(1, 1, 1, 3),
b = c(1, 1, 3, 3)
),
y = c(1, 1, 1, 2)
)
# Both `$a` and `$b` are columns in the data frame column, `x`
df_packed
```

```
## # A tibble: 4 x 2
## x$a $b y
## <dbl> <dbl> <dbl>
## 1 1 1 1
## 2 1 1 1
## 3 1 3 1
## 4 3 3 2
```

```
# `x` itself is another data frame
df_packed$x
```

```
## # A tibble: 4 x 2
## a b
## <dbl> <dbl>
## 1 1 1
## 2 1 1
## 3 1 3
## 4 3 3
```

Even though you can create one of these, as an end user there isn’t much yet that you can do with them, so you shouldn’t have to worry about this very much.

Looking at `df_packed`

, the unique rows are 1, 3, and 4, which `vec_unique()`

can determine, but `unique()`

doesn’t correctly pick up on.

`vec_unique(df_packed)`

```
## # A tibble: 3 x 2
## x$a $b y
## <dbl> <dbl> <dbl>
## 1 1 1 1
## 2 1 3 1
## 3 3 3 2
```

`unique(df_packed)`

```
## # A tibble: 3 x 2
## x$a $b y
## <dbl> <dbl> <dbl>
## 1 1 1 1
## 2 1 1 1
## 3 3 3 2
```

As packed data frames become more prevalent in the tidyverse, it will be nice to have tools that handle them consistently.
For example, there is already `tidyr::pack()`

and `tidyr::unpack()`

, which helps power `tidyr::unnest()`

.

`split()`

`split(x, by)`

will divide up `x`

into groups using `by`

to determine where the unique groups are. It assumes `by`

is a factor, and will coerce your input to a factor if it isn’t already one. Like `unique()`

, it will slice up a data frame by rows rather than by columns.

`split(df, df$y)`

```
## $a
## # A tibble: 3 x 3
## x y z
## <int> <chr> <chr>
## 1 1 a x
## 2 3 a y
## 3 4 a x
##
## $b
## # A tibble: 1 x 3
## x y z
## <int> <chr> <chr>
## 1 2 b x
```

One thing about `split()`

is that it uses the unique values as labels on the list elements. To take a slightly different approach, `vec_split()`

was created, which returns a data frame instead, holding the unique `key`

values in their own parallel column. `vec_split()`

returns a data frame to keep vctrs lightweight, but the print method for these can be a little complex, and I think tibble’s print method does a nicer job.

```
df_split <- vec_split(df, df$y)
df_split <- as_tibble(df_split)
df_split
```

```
## # A tibble: 2 x 2
## key val
## <chr> <list<df[,3]>>
## 1 a [3 × 3]
## 2 b [1 × 3]
```

`df_split$val`

```
## <list_of<
## tbl_df<
## x: integer
## y: character
## z: character
## >
## >[2]>
## [[1]]
## # A tibble: 3 x 3
## x y z
## <int> <chr> <chr>
## 1 1 a x
## 2 3 a y
## 3 4 a x
##
## [[2]]
## # A tibble: 1 x 3
## x y z
## <int> <chr> <chr>
## 1 2 b x
```

One useful feature of `vec_split()`

is that it doesn’t expect a factor as the second argument, which means that a data frame can be provided to split by, and since uniqueness is determined row wise this allows us to split by multiple columns. The `key`

ends up as the unique rows of the data frame, meaning that the `key`

is actually a data frame column, creating a packed data frame!

```
df_multi_split <- vec_split(df, df[c("y", "z")])
df_multi_split <- as_tibble(df_multi_split)
df_multi_split
```

```
## # A tibble: 3 x 2
## key$y $z val
## <chr> <chr> <list<df[,3]>>
## 1 a x [2 × 3]
## 2 b x [1 × 3]
## 3 a y [1 × 3]
```

Technically you can provide `split()`

a data frame to split `by`

, but remember that it will try and treat it like a factor! Since the data frame is technically a list, it will run `interaction()`

on it first to get a single factor it can use to split by. Notice that this gives a level for `b.y`

which does not exist as a row in our data frame.

`df[c("y", "z")]`

```
## # A tibble: 4 x 2
## y z
## <chr> <chr>
## 1 a x
## 2 b x
## 3 a y
## 4 a x
```

`interaction(df[c("y", "z")])`

```
## [1] a.x b.x a.y a.x
## Levels: a.x b.x a.y b.y
```

This results in the following split, with a `b.y`

element with no rows which we may or may not have wanted.

`split(df, df[c("y", "z")])`

```
## $a.x
## # A tibble: 2 x 3
## x y z
## <int> <chr> <chr>
## 1 1 a x
## 2 4 a x
##
## $b.x
## # A tibble: 1 x 3
## x y z
## <int> <chr> <chr>
## 1 2 b x
##
## $a.y
## # A tibble: 1 x 3
## x y z
## <int> <chr> <chr>
## 1 3 a y
##
## $b.y
## # A tibble: 0 x 3
## # … with 3 variables: x <int>, y <chr>, z <chr>
```