Stand and Stock Table Workflows with Tidyverse

FOR 372

Elliot Shannon

2025-03-25

Goals

  • Use tidyverse workflows to build stand and stock tables.

  • Leverage group_by() grouping methods to compute estimates for populations of interest.

  • Understand complete() and pivot_wider() functions.

Example Stand-Level Estimates

Estimates for number of trees and wood volume per acre by species.
Species             DBH (in)   Trees/ac   Volume (ft3/ac)
Abies balsamea          11.3          8             142.8
Betula papyrifera       14.8          8             269.6
Betula papyrifera       15.4          8             293.7
Pinus strobus            9.8          8             116.3
Pinus strobus           10.7          8             143.6
Pinus strobus           13.1          8             231.9

Stand tables summarize a quantitative discrete variable (e.g., stem count) grouped by one or more categorical (i.e., qualitative) variables (e.g., size class or species).

                    DBH Classes (in)
Species             [6,10]   (10,14]   (14,18]   Totals
Abies balsamea           0         8         0        8
Betula papyrifera        0         0        16       16
Pinus strobus            8        16         0       24
Totals                   8        24        16       48

Values are numbers of trees per acre.

Stock tables summarize a quantitative continuous variable (e.g., volume, weight, or basal area) grouped by one or more categorical variables.

                    DBH Classes (in)
Species             [6,10]   (10,14]   (14,18]   Totals
Abies balsamea           0     142.8         0    142.8
Betula papyrifera        0         0     563.3    563.3
Pinus strobus        116.3     375.5         0    491.8
Totals               116.3     518.3     563.3   1197.9

Values are volume in cubic feet per acre.

Each row and column combination in the table defines a unique population.

                    DBH Classes (in)
Species             [6,10]   (10,14]   (14,18]
Abies balsamea      Pop 1    Pop 2     Pop 3
Betula papyrifera   Pop 4    Pop 5     Pop 6
Pinus strobus       Pop 7    Pop 8     Pop 9

Our task is to generate an estimate for each population.
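As a minimal sketch, the nine populations in the table above can be listed by crossing the three species with the three DBH classes. The object and column names here (populations, species, dbh_class, population) are made up for illustration and are not used elsewhere in the workflow.

```r
library(tidyr)

# Cross every species with every DBH class: 3 x 3 = 9 populations.
# expand_grid() varies the last argument fastest, which matches the
# row-by-row numbering (Pop 1 to Pop 9) used in the table.
populations <- expand_grid(
  species   = c("Abies balsamea", "Betula papyrifera", "Pinus strobus"),
  dbh_class = c("[6,10]", "(10,14]", "(14,18]")
)

# Label the populations in table order.
populations$population <- paste("Pop", seq_len(nrow(populations)))

populations
```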

Recall this example population

We take a sample

# Load the tidyverse, then read in the stand data
library(tidyverse)
stands <- read_csv("two_stands.csv")

# Filter only Stand 1 Overstory trees
# Select columns needed to build the stand and stock tables
o_stand <- stands %>% 
  filter(stand_id == 1, tree_type == "Overstory") %>%
  select(plot_id, tree_count, scientific_name, DBH_in, vol_cu_ft)

o_stand
# A tibble: 7 × 5
  plot_id tree_count scientific_name   DBH_in vol_cu_ft
    <dbl>      <dbl> <chr>              <dbl>     <dbl>
1       1          1 Abies balsamea      11.3      17.8
2       1          1 Pinus strobus        9.8      14.5
3       1          1 Pinus strobus       10.7      17.9
4       2          0 <NA>                 0         0  
5       3          1 Betula papyrifera   14.8      33.6
6       3          1 Betula papyrifera   15.4      36.6
7       3          1 Pinus strobus       13.1      28.9

Since there were no trees on Plot 2 in Stand 1, we preserve that zero count using the tree_count variable.

Why save this zero value?

A plot with no trees is part of the sample, reflects a characteristic of the population, and hence needs to be included as a zero when computing population parameter estimates.
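A small numerical sketch shows what goes wrong if the empty plot is dropped. It uses the tree counts from the three plots above (3, 0, and 3 trees) and assumes the 24 ft plot radius used later in these slides; the variable names are illustrative only.

```r
# Trees tallied on each of the three plots in Stand 1 (Plot 2 had none).
tree_counts <- c(plot_1 = 3, plot_2 = 0, plot_3 = 3)

# Tree factor for a 24 ft radius plot (trees/ac represented by one tally tree).
TF <- 43560 / (pi * 24^2)

# Per-acre value on each plot.
trees_per_ac <- tree_counts * TF

# Estimate that averages over all three sampled plots, zero included.
mean_with_zero <- mean(trees_per_ac)

# Estimate if the empty plot is silently dropped from the sample.
mean_without_zero <- mean(trees_per_ac[tree_counts > 0])

round(c(with_zero = mean_with_zero, without_zero = mean_without_zero), 1)
# with_zero = 48.1, without_zero = 72.2
```

Dropping the zero inflates the estimate by half, because the sample size becomes 2 instead of 3.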

We convert plot_id to a factor (effectively preserving Plot 2 in the dataset as an absent factor level) and remove the row with tree_count == 0.

# Make plot_id a factor then remove plots with zero trees.
o_stand <- o_stand %>% 
  mutate(plot_id = as.factor(plot_id)) %>% 
  filter(tree_count != 0)

o_stand
# A tibble: 6 × 5
  plot_id tree_count scientific_name   DBH_in vol_cu_ft
  <fct>        <dbl> <chr>              <dbl>     <dbl>
1 1                1 Abies balsamea      11.3      17.8
2 1                1 Pinus strobus        9.8      14.5
3 1                1 Pinus strobus       10.7      17.9
4 3                1 Betula papyrifera   14.8      33.6
5 3                1 Betula papyrifera   15.4      36.6
6 3                1 Pinus strobus       13.1      28.9
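To see that Plot 2 is still remembered after its row is removed, we can inspect the factor levels. The sketch below uses a made-up miniature dataset (plots, kept) rather than o_stand so that it runs on its own.

```r
library(dplyr)
library(tibble)

# Hypothetical miniature of the plot data: Plot 2 has no trees.
plots <- tibble(plot_id    = c(1, 1, 2, 3),
                tree_count = c(1, 1, 0, 1))

# Convert to a factor first, then drop the zero-count row.
kept <- plots %>%
  mutate(plot_id = as.factor(plot_id)) %>%
  filter(tree_count != 0)

# The row for Plot 2 is gone, but "2" remains a factor level.
levels(kept$plot_id)
```

The order matters: if the filter came before as.factor(), level "2" would never be created.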

complete() function

Original dataframe

Student   Subject   Score
Elliot    Math         75
Elliot    Biology      80
Ben       Biology      72
William   Physics      98

Completed dataframe

Student   Subject   Score
Elliot    Math         75
Elliot    Biology      80
Elliot    Physics      NA
Ben       Math         NA
Ben       Biology      72
Ben       Physics      NA
William   Math         NA
William   Biology      NA
William   Physics      98
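The student example can be reproduced with a few lines of code. This is a sketch with the data typed in by hand; the object names scores and completed are illustrative.

```r
library(dplyr)
library(tidyr)
library(tibble)

# The original dataframe: only observed Student/Subject pairs are present.
scores <- tribble(
  ~Student,  ~Subject,  ~Score,
  "Elliot",  "Math",    75,
  "Elliot",  "Biology", 80,
  "Ben",     "Biology", 72,
  "William", "Physics", 98
)

# Add a row for every Student x Subject combination.
# Combinations that were not observed get Score = NA.
# Note: complete() sorts character columns, so the row order
# differs from the table above.
completed <- scores %>%
  complete(Student, Subject)

completed
```

Supplying fill = list(Score = 0) would put zeros in place of the NA values, which is how complete() is used on the tree data.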

We will then add the DBH class column and use the complete() function to add zeros into the dataset where species and DBH class combinations (that define a population) were not observed.

# Add the DBH class column.
o_stand <- o_stand %>%
  mutate(DBH_4in = cut_width(DBH_in, width=4))

# Make implicit zeros explicit for species and DBH class 
# combinations not observed on plots.
o_stand <- o_stand %>% 
  complete(plot_id, scientific_name, DBH_4in, 
           fill = list(tree_count = 0, vol_cu_ft = 0))

We’ve made implicit zeros explicit.

# A tibble: 28 × 6
   plot_id scientific_name   DBH_4in tree_count DBH_in vol_cu_ft
   <fct>   <chr>             <fct>        <dbl>  <dbl>     <dbl>
 1 1       Abies balsamea    [6,10]           0   NA         0  
 2 1       Abies balsamea    (10,14]          1   11.3      17.8
 3 1       Abies balsamea    (14,18]          0   NA         0  
 4 1       Betula papyrifera [6,10]           0   NA         0  
 5 1       Betula papyrifera (10,14]          0   NA         0  
 6 1       Betula papyrifera (14,18]          0   NA         0  
 7 1       Pinus strobus     [6,10]           1    9.8      14.5
 8 1       Pinus strobus     (10,14]          1   10.7      17.9
 9 1       Pinus strobus     (14,18]          0   NA         0  
10 2       Abies balsamea    [6,10]           0   NA         0  
11 2       Abies balsamea    (10,14]          0   NA         0  
12 2       Abies balsamea    (14,18]          0   NA         0  
13 2       Betula papyrifera [6,10]           0   NA         0  
14 2       Betula papyrifera (10,14]          0   NA         0  
15 2       Betula papyrifera (14,18]          0   NA         0  
16 2       Pinus strobus     [6,10]           0   NA         0  
17 2       Pinus strobus     (10,14]          0   NA         0  
18 2       Pinus strobus     (14,18]          0   NA         0  
19 3       Abies balsamea    [6,10]           0   NA         0  
20 3       Abies balsamea    (10,14]          0   NA         0  
21 3       Abies balsamea    (14,18]          0   NA         0  
22 3       Betula papyrifera [6,10]           0   NA         0  
23 3       Betula papyrifera (10,14]          0   NA         0  
24 3       Betula papyrifera (14,18]          1   14.8      33.6
25 3       Betula papyrifera (14,18]          1   15.4      36.6
26 3       Pinus strobus     [6,10]           0   NA         0  
27 3       Pinus strobus     (10,14]          1   13.1      28.9
28 3       Pinus strobus     (14,18]          0   NA         0  
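Rows for Plot 2 appear in the output because complete() expands a factor column over all of its levels, including levels with no remaining rows. A minimal sketch with a made-up two-tree dataset (trees, completed) illustrates this behavior.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical data: trees on Plots 1 and 3 only, but plot_id
# keeps "2" as a factor level.
trees <- tibble(
  plot_id         = factor(c(1, 3), levels = c(1, 2, 3)),
  scientific_name = c("Abies balsamea", "Pinus strobus"),
  tree_count      = c(1, 1)
)

# complete() uses every factor level, so Plot 2 gets a row
# for each species with tree_count filled in as 0.
completed <- trees %>%
  complete(plot_id, scientific_name, fill = list(tree_count = 0))

completed
```

In the full o_stand output the same logic gives 3 plots x 3 species x 3 DBH classes = 27 combinations, plus one extra row because two Betula papyrifera trees on Plot 3 fall in the same class.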

Now o_stand is ready for the workflow that generates stand-level estimates for trees per acre and volume per acre.

Moving forward, we will group by species and DBH class to partition data into the nine populations.

First, we add a tree factor (TF, trees/ac) column to o_stand. Recall that the tree factor is calculated as

\[TF = \frac{43560}{\pi R^2}\]

where \(R\) is the radius of the plot (ft).

Here, we have \(R = 24\) ft.

o_stand <- o_stand %>% 
  mutate(TF = 43560 / (pi * 24^2))
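For reference, the same calculation can be wrapped in a small helper to check the value for any plot radius. The function name tree_factor() is hypothetical and is not part of the workflow above.

```r
# Hypothetical helper: trees per acre represented by one tree tallied
# on a circular fixed-area plot with the given radius (ft).
tree_factor <- function(radius_ft) {
  sq_ft_per_acre <- 43560
  plot_area_sq_ft <- pi * radius_ft^2
  sq_ft_per_acre / plot_area_sq_ft
}

round(tree_factor(24), 2)
# 24.07
```

Each tree tallied on a 24 ft radius plot therefore represents about 24 trees per acre.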

Step 1, compute per unit area plot-level summaries.

plot_summary <- o_stand %>% 
  group_by(scientific_name, DBH_4in, plot_id) %>% 
  summarize(trees_per_ac = sum(tree_count * TF),
            vol_per_ac = sum(tree_count * TF * vol_cu_ft),
            .groups = "drop_last")

plot_summary
# A tibble: 27 × 5
# Groups:   scientific_name, DBH_4in [9]
   scientific_name   DBH_4in plot_id trees_per_ac vol_per_ac
   <chr>             <fct>   <fct>          <dbl>      <dbl>
 1 Abies balsamea    [6,10]  1                0           0 
 2 Abies balsamea    [6,10]  2                0           0 
 3 Abies balsamea    [6,10]  3                0           0 
 4 Abies balsamea    (10,14] 1               24.1       428.
 5 Abies balsamea    (10,14] 2                0           0 
 6 Abies balsamea    (10,14] 3                0           0 
 7 Abies balsamea    (14,18] 1                0           0 
 8 Abies balsamea    (14,18] 2                0           0 
 9 Abies balsamea    (14,18] 3                0           0 
10 Betula papyrifera [6,10]  1                0           0 
# ℹ 17 more rows
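The .groups = "drop_last" argument removes only the last grouping variable (plot_id), so the result stays grouped by species and DBH class. A toy sketch with made-up values (toy, plot_level, stand_level) shows why this sets up the next step.

```r
library(dplyr)
library(tibble)

# Hypothetical per-tree data: two species on two plots.
toy <- tibble(
  species      = c("A", "A", "B", "B"),
  plot_id      = c(1, 2, 1, 2),
  trees_per_ac = c(24, 0, 48, 24)
)

# Summarize to plot level, dropping only the last grouping variable.
plot_level <- toy %>%
  group_by(species, plot_id) %>%
  summarize(trees_per_ac = sum(trees_per_ac), .groups = "drop_last")

# Still grouped by species, so the next summarize() averages over plots.
group_vars(plot_level)

stand_level <- plot_level %>%
  summarize(y_bar_trees = mean(trees_per_ac))

stand_level
```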

Step 2, average the plot-level values over plots to get stand-level per unit area estimates for each population.

stand_estimates <- plot_summary %>% 
  summarize(y_bar_trees = mean(trees_per_ac),
            y_bar_vol = mean(vol_per_ac))

stand_estimates 
# A tibble: 9 × 4
# Groups:   scientific_name [3]
  scientific_name   DBH_4in y_bar_trees y_bar_vol
  <chr>             <fct>         <dbl>     <dbl>
1 Abies balsamea    [6,10]         0           0 
2 Abies balsamea    (10,14]        8.02      143.
3 Abies balsamea    (14,18]        0           0 
4 Betula papyrifera [6,10]         0           0 
5 Betula papyrifera (10,14]        0           0 
6 Betula papyrifera (14,18]       16.0       563.
7 Pinus strobus     [6,10]         8.02      116.
8 Pinus strobus     (10,14]       16.0       376.
9 Pinus strobus     (14,18]        0           0 
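As a hand check on one population, consider Abies balsamea in the (10,14] class: one tree (17.8 cubic feet) on Plot 1 and nothing on Plots 2 and 3. The sketch below repeats the arithmetic outside the pipeline, using illustrative variable names.

```r
# Tree factor for a 24 ft radius plot.
TF <- 43560 / (pi * 24^2)

# Abies balsamea, (10,14] class: one tree on Plot 1, none on Plots 2 and 3.
trees_per_ac <- c(1, 0, 0) * TF
vol_per_ac   <- c(17.8, 0, 0) * TF

# Stand-level estimates are the means over all three plots.
round(mean(trees_per_ac), 2)
# 8.02
round(mean(vol_per_ac), 1)
# 142.8
```

These agree with row 2 of stand_estimates.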

pivot_wider() function

Original dataframe

Player       Stat       Value
Tatum        Points      27.2
Tatum        Rebounds     5.9
Durant       Rebounds     6.1
Durant       Blocks       1.2
Lillard      Assists      7.1
Cunningham   Points      25.7
Cunningham   Rebounds     6.1
Cunningham   Assists      9.2

Wider dataframe

Player       Points   Rebounds   Assists   Blocks
Tatum          27.2        5.9        NA       NA
Durant           NA        6.1        NA      1.2
Lillard          NA         NA       7.1       NA
Cunningham     25.7        6.1       9.2       NA
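The player example can be reproduced as follows. This is a sketch with the data typed in by hand; the object names stats and wide are illustrative.

```r
library(dplyr)
library(tidyr)
library(tibble)

# The original (long) dataframe: one row per Player/Stat pair.
stats <- tribble(
  ~Player,      ~Stat,      ~Value,
  "Tatum",      "Points",   27.2,
  "Tatum",      "Rebounds", 5.9,
  "Durant",     "Rebounds", 6.1,
  "Durant",     "Blocks",   1.2,
  "Lillard",    "Assists",  7.1,
  "Cunningham", "Points",   25.7,
  "Cunningham", "Rebounds", 6.1,
  "Cunningham", "Assists",  9.2
)

# One row per Player, one column per Stat.
# Player/Stat pairs that were not observed become NA.
# Note: new columns appear in order of first appearance in Stat,
# so Blocks comes before Assists here.
wide <- stats %>%
  pivot_wider(names_from = Stat, values_from = Value)

wide
```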

Stand Table

Step 3, use pivot_wider() to build the stand and stock tables.

# Stand table.
stand_estimates %>% 
  pivot_wider(id_cols = scientific_name, 
              names_from = DBH_4in, 
              values_from = y_bar_trees) # trees/ac
# A tibble: 3 × 4
# Groups:   scientific_name [3]
  scientific_name   `[6,10]` `(10,14]` `(14,18]`
  <chr>                <dbl>     <dbl>     <dbl>
1 Abies balsamea        0         8.02       0  
2 Betula papyrifera     0         0         16.0
3 Pinus strobus         8.02     16.0        0  

Stand Table

                    DBH Classes (in)
Species             [6,10]   (10,14]   (14,18]
Abies balsamea           0         8         0
Betula papyrifera        0         0        16
Pinus strobus            8        16         0

Values are numbers of trees per acre.
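The stand table shown at the start of these slides also carries a Totals row and column. One possible way to add them is sketched below; it is not necessarily how that table was produced. The estimates are typed in by hand and rounded to whole trees, and the names stand_table, totals_row, and stand_table_totals are illustrative.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Stand-level estimates (trees/ac), rounded as in the formatted table.
stand_estimates <- tribble(
  ~scientific_name,    ~DBH_4in,  ~y_bar_trees,
  "Abies balsamea",    "[6,10]",   0,
  "Abies balsamea",    "(10,14]",  8,
  "Abies balsamea",    "(14,18]",  0,
  "Betula papyrifera", "[6,10]",   0,
  "Betula papyrifera", "(10,14]",  0,
  "Betula papyrifera", "(14,18]", 16,
  "Pinus strobus",     "[6,10]",   8,
  "Pinus strobus",     "(10,14]", 16,
  "Pinus strobus",     "(14,18]",  0
)

# Build the stand table, then add a Totals column (sum across DBH classes).
stand_table <- stand_estimates %>%
  pivot_wider(id_cols = scientific_name,
              names_from = DBH_4in,
              values_from = y_bar_trees) %>%
  mutate(Totals = rowSums(across(where(is.numeric))))

# Totals row: sum each numeric column across species.
totals_row <- stand_table %>%
  summarize(across(where(is.numeric), sum)) %>%
  mutate(scientific_name = "Totals")

stand_table_totals <- bind_rows(stand_table, totals_row)
stand_table_totals
```

If the table is built from a grouped tibble such as stand_estimates in the workflow, calling ungroup() before summarize() keeps the Totals row from being computed per species.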

Stock Table

# Stock table.
stand_estimates %>% 
  pivot_wider(id_cols = scientific_name, 
              names_from = DBH_4in, 
              values_from = y_bar_vol) # vol/ac
# A tibble: 3 × 4
# Groups:   scientific_name [3]
  scientific_name   `[6,10]` `(10,14]` `(14,18]`
  <chr>                <dbl>     <dbl>     <dbl>
1 Abies balsamea          0       143.        0 
2 Betula papyrifera       0         0       563.
3 Pinus strobus         116.      376.        0 

Stock Table

                    DBH Classes (in)
Species             [6,10]   (10,14]   (14,18]
Abies balsamea           0     142.8         0
Betula papyrifera        0         0     563.3
Pinus strobus        116.3     375.5         0

Values are volume in cubic feet per acre.