Forest ecosystems represent the largest terrestrial carbon (C) sink on Earth [1, 2], and the United Nations Framework Convention on Climate Change has accordingly recognized their management as an effective strategy for offsetting greenhouse gas (GHG) emissions [4, 5]. As part of the Convention, the U.S. has for many years submitted an annual national report, the National Greenhouse Gas Inventory (NGHGI), detailing emissions and removals of GHGs. In addition to international reporting requirements, GHG budgets are being developed at sub-national scales, including states (e.g., California) and ownerships (e.g., the National Forest System climate change scorecard). Forest C stocks in the U.S. are estimated using data from the national forest inventory conducted by the USDA Forest Service Forest Inventory and Analysis (FIA) program. Broad forest ecosystem components (e.g., aboveground live biomass) have been delineated to generalize C stocks and to meet international reporting agreements aimed at refining understanding of global carbon cycling [2, 3]. Carbon estimates for the ecosystem components of forest floor (inclusive of litter, fine woody debris, and humic soil horizons), down dead wood, belowground (BG) biomass, and soil organic matter are calculated by FIA using models based on geographic area, forest type, and, in some cases, stand age [6, 8]. Estimates of aboveground (AG) standing live and dead tree C stocks are based on biomass estimates obtained from inventory tree data [6, 9]. Although forest C stock estimates, such as those from FIA, are readily available at national and regional scales [6, 7], there is increasing interest in disaggregating these large-scale numerical estimates into spatially continuous maps to enable strategic forest management and monitoring activities geared toward offsetting GHG emissions and advancing C dynamics research.
Secondary to the need for spatially continuous forest C maps, numerous constituents (e.g., managers, policy makers, scientists, and forest analysts) require an efficient methodology for incorporating annual monitoring information into C maps. Sophisticated approaches to mapping forest C stocks may provide robust estimates of stocks, but they lack the flexibility to rapidly incorporate annual monitoring information. Because numerous forest C pools may change on annual time steps, especially in response to stochastic disturbance events, the temporal accuracy of C maps may often be as important as their spatial accuracy. Woodall et al. found that actual standing dead tree C stocks were often significantly different from those modeled for the same inventory plots. Despite the measurement and model error associated with annual forest inventory programs, the temporally dynamic nature of forest ecosystems (e.g., wildfires and wind events) necessitates the incorporation of annual data into map products employed by scientists and stakeholders alike.
Wilson et al. developed a methodology (hereafter referred to as Phenological Gradient Nearest Neighbor, or PGNN, for convenience) for producing maps of tree species occurrence and relative abundance over large areas by combining information collected on FIA field plots with 250 m pixel resolution raster data in a k-nearest neighbor (k-NN) imputation framework. The PGNN approach builds upon the Gradient Nearest Neighbor (GNN) work of Ohmann and Gregory, who integrated nearest-neighbor imputation of FIA plots with ecological ordination via canonical correspondence analysis (CCA). PGNN is best described as a hybrid of the k-NN and GNN approaches: like GNN it makes use of CCA, but it uses k nearest neighbors during imputation rather than only a single neighbor. Another distinguishing characteristic is that it uses vegetation phenology information derived from multi-temporal satellite imagery, as well as climate, topographic, and ecoregion data compiled at a 250 m pixel resolution. One of the most attractive features of this approach is the efficiency with which a plot identification map can be produced at the national scale. In other words, every pixel is assigned a forest inventory plot label as well as the attributes of the labeled plot’s nearest neighbors, as defined by the CCA model. In the case of forest C accounting, every pixel could thus be assigned C stock estimates rapidly on an annual time step.
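To make the imputation step concrete, the following is a minimal sketch of k-NN imputation in an ordination space, not the PGNN implementation itself. It assumes that plots and pixels have already been projected onto the CCA axes (the ordination is not shown), that distance is Euclidean, and that the k neighbors are combined with an unweighted mean; the function name `knn_impute`, the array names, and the example values are all hypothetical.

```python
import numpy as np


def knn_impute(plot_scores, plot_values, pixel_scores, k=3):
    """Impute a plot attribute to pixels from their k nearest plots.

    plot_scores  : (n_plots, n_axes) plot coordinates on the ordination axes
    plot_values  : (n_plots,) attribute measured on each plot (e.g., C density)
    pixel_scores : (n_pixels, n_axes) pixel coordinates on the same axes
    k            : number of neighbors (assumed; k=1 gives single-neighbor imputation)

    Returns the index of the single nearest plot for each pixel (the "plot
    label") and the unweighted mean of the attribute over the k nearest plots.
    """
    plot_scores = np.asarray(plot_scores, dtype=float)
    plot_values = np.asarray(plot_values, dtype=float)
    pixel_scores = np.asarray(pixel_scores, dtype=float)

    # Pairwise Euclidean distances, shape (n_pixels, n_plots).
    diff = pixel_scores[:, None, :] - plot_scores[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # Indices of the k closest plots for every pixel, closest first.
    neighbor_ids = np.argsort(dist, axis=1, kind="stable")[:, :k]

    nearest_plot = neighbor_ids[:, 0]
    imputed = plot_values[neighbor_ids].mean(axis=1)
    return nearest_plot, imputed


# Made-up example: four plots and two pixels on two ordination axes.
plots = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
c_density = [10.0, 20.0, 30.0, 100.0]  # illustrative values only
pixels = [[0.2, 0.1], [4.9, 5.2]]

labels, estimates = knn_impute(plots, c_density, pixels, k=2)
```

Because the plot label for each pixel depends only on the ordination scores, updating a map with a new year of plot measurements in this sketch amounts to replacing `plot_values` and repeating the averaging step. The brute-force distance matrix is only practical for small inputs; a national 250 m grid would need a spatial index or chunked processing.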
Given the need for C maps at the national scale and the possible application of PGNN, the goal of this study was to apply PGNN to impute national forest inventory plots to a spatially continuous raster grid in order to produce mapped estimates of forest C density for the conterminous U.S. The specific objectives were: 1) to produce and interpret maps of forest C density by individual pools and combinations thereof (total forest ecosystem C density, live tree AG, live tree BG, live understory AG and BG, standing dead tree AG, downed dead wood, forest floor, soil organic carbon, and the pool that has the highest proportion of total forest ecosystem C density); 2) to validate the C mapping approach by comparing map-based and field plot-based estimates using a variety of metrics; and 3) to suggest future research directions and applications.