---
title: "Analysis of NEON Woody plant vegetation structure data"
subtitle: "ACCE DTP course"
author: "Anna Krystalli"
date: "2024-03-19"
format:
html:
toc: true
theme: minty
highlight-style: dracula
df-print: paged
editor: visual
bibliography: data-raw/wood-survey-data-master/references.bib
---
## Background
![](data-raw/wood-survey-data-master/neon-logo.png){width="200"}
The [NEON Woody plant vegetation structure dataset](https://data.neonscience.org/data-products/DP1.10098.001) [@DP1.10098.001/provisional] contains **structure measurements, including height, canopy diameter, and stem diameter, as well as mapped position of individual woody plants across the survey area.**
This data product contains the quality-controlled, native sampling resolution data from in-situ measurements of live and standing dead woody individuals and shrub groups, from all terrestrial NEON sites with qualifying woody vegetation. With some modifications, this protocol adopts guidelines established by the @forestry2012 for measuring tree species. The exact measurements collected per individual depend on growth form, and these measurements are focused on enabling biomass and productivity estimation, estimation of shrub volume and biomass, and calibration / validation of multiple NEON airborne remote-sensing data products.
Our analyses focus on the **relationship between individual stem height and diameter** and how that relationship **varies across growth forms**.
### Data Preparation
Data was prepared for analysis by:
- Compiling all individual raw data files into a single table.
- Merging individual data with plot level data and geolocating individuals.
The data preparation steps are contained in the `data-raw/individuals.R` script.
## Summary statistics
Prepared data were also subset to columns of interest `stem_diameter`, `height` and `growth_form` and rows filtered to complete cases. Liana growth forms were removed.
```{r}
#| label: setup
#| code-fold: true
#| message: false
## Setup ----
# Load libraries
library(dplyr)
library(ggplot2)
# Load data
individual <- readr::read_csv(
here::here("data", "individual.csv")
) %>%
select(stem_diameter, height, growth_form)
## Subset analysis data ----
analysis_df <- individual %>%
filter(complete.cases(.), growth_form != "liana")
## Order growth form levels
gf_levels <- table(analysis_df$growth_form) %>%
sort() %>%
names()
analysis_df <- analysis_df %>%
mutate(growth_form = factor(growth_form,
levels = gf_levels
))
```
The final data set contains a total of **`{r} nrow(analysis_df)`**
```{r}
#| echo: false
#| label: tbl-print
analysis_df
```
```{r}
#| echo: false
#| label: fig-growth-form-counts
#| fig-cap: "Distribution of individual counts across growth forms."
analysis_df %>%
ggplot(aes(
y = growth_form, colour = growth_form,
fill = growth_form
)) +
geom_bar(alpha = 0.5, show.legend = FALSE)
```
@fig-growth-form-counts shows the distribution of individual counts across growth forms in the dataset.
```{r}
#| echo: false
#| label: fig-violin-plots
#| fig-cap: "Distribution of log stem_diameter and log height across growth forms"
analysis_df %>%
tidyr::pivot_longer(
cols = c(stem_diameter, height),
names_to = "var",
values_to = "value"
) %>%
ggplot(aes(
x = log(value), y = growth_form,
colour = growth_form, fill = growth_form
)) +
geom_violin(alpha = 0.5, trim = TRUE, show.legend = FALSE) +
geom_boxplot(alpha = 0.7, show.legend = FALSE) +
facet_grid(~var) +
theme_linedraw()
```
@fig-violin-plots shows the log distribution of stem diameter and log height across growth forms.
# Analysis
## Modelling overall `stem_diameter` as a function of `height`
Initially we fit a linear model of form `log(stem_diameter)` as a function of `log(height)`
```{r}
lm_overall <- lm(
log(stem_diameter) ~ log(height),
analysis_df
)
```
```{r}
#| echo: false
#| tbl-cap: "Overall model evaluation"
#| label: tbl-overall-glance
library(gt)
lm_overall |>
broom::glance() |>
gt() |>
fmt_number(decimals = 2)
```
```{r}
#| echo: false
#| tbl-cap: "Overall model coefficents"
#| label: tbl-overall-tidy
library(gt)
lm_overall |> broom::tidy() |>
gt() |>
fmt_number(decimals = 4) |>
tab_style_body(
columns = "p.value",
style = cell_text(weight = "bold"),
fn = function(x) x < 0.05
)
```
```{r}
#| echo: false
#| fig-cap: "Relationship between stem diameter and height across all data."
#| label: fig-overall-lm
analysis_df %>%
ggplot(aes(x = log(height), y = log(stem_diameter))) +
geom_point(alpha = 0.2) +
geom_smooth(method = "lm") +
xlab("Log of height (m)") +
ylab("Log of stem diameter (cm)") +
theme_linedraw()
```
See @fig-overall-lm, @tbl-overall-glance and @tbl-overall-tidy for results.
## Growth form level analysis
We also fit a model with an interaction term with growth form included.
```{r}
lm_growth <- lm(
log(stem_diameter) ~ log(height) * growth_form,
analysis_df
)
```
```{r}
#| echo: false
#| tbl-cap: "Growth Form interaction model evaluation"
#| label: tbl-growth-glance
library(gt)
lm_growth |> broom::glance() |> gt() |>
fmt_number(decimals = 2)
```
```{r}
#| echo: false
#| tbl-cap: "Growth Form interaction model coefficents"
#| label: tbl-growth-tidy
library(gt)
lm_growth |> broom::tidy() |> gt() |>
fmt_number(decimals = 4) |>
tab_style_body(
columns = "p.value",
style = cell_text(weight = "bold"),
fn = function(x) x < 0.05
)
```
```{r}
#| echo: false
#| fig-cap: "Relationship between stem diameter and height across growth forms."
#| label: fig-growth-formlm
analysis_df %>%
ggplot(aes(x = log(height), y = log(stem_diameter), colour = growth_form)) +
geom_point(alpha = 0.1) +
geom_smooth(method = "lm") +
labs(
x = "Log of height (m)",
y = "Log of stem diameter (cm)",
colour = "Growth forms"
) +
theme_linedraw()
```
See @fig-growth-formlm, @tbl-growth-glance and @tbl-growth-tidy for results.
## Summary
Our results follow findings in the literature e.g @THORNLEY1999195, @CANNELL1984299 and @Haase.
report.qmd
The report.qmd
Quarto document contains the project report.
Your file report should look a bit like:
and render to something that looks like this: