Mapping Whitebark Pine using Citizen Science Data

A proof-of-concept for using iNaturalist data and herbarium records to map whitebark pine.


Citizen-science data and herbarium records are important sources of information for biodiversity science and conservation. As a proof-of-concept, I conducted a pilot study using iNaturalist data and herbarium records to map whitebark pine (Pinus albicaulis) outside the published range across Canada and the United States. This pilot study is part of The Whitebark Pine Project.


To begin, I downloaded all whitebark pine observations from the Global Biodiversity Information Facility (GBIF 2023) using the Darwin Core Archive option and imported the data into a PostgreSQL database. The GBIF dataset contained a total of 7,652 records, including 6,034 herbarium records and 1,618 research grade observations from iNaturalist. Next, I performed a quality control (QC) review of the GBIF data and included in the analysis only the records that met the following criteria: coordinateuncertaintyinmeters <= 250 AND decimallatitude IS NOT NULL AND decimallongitude IS NOT NULL AND decimallongitude < 0 AND year IS NOT NULL AND year >= 2003 AND informationwithheld IS NULL. From the records that met these criteria, I kept only those whose decimallatitude and decimallongitude values were each recorded to three or more decimal places. Lastly, I reviewed the filtered records in a Geographic Information System (GIS) and excluded observations with locations well outside the known range of whitebark pine. In all cases these records were located in the extreme desert southwest, mid-west, and eastern U.S. or Europe, and most likely represent the locations of the institutions where the herbarium specimens are stored. A total of 1,486 observations passed the QC review: 1,436 research grade iNaturalist records and 50 herbarium records.
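As a rough illustration, the QC criteria above could be expressed in Python along the following lines. This is a sketch, not the query used in the study. It assumes each record is a dictionary of text values keyed by the lowercase column names quoted above, that a missing value (None or an empty string) plays the role of SQL NULL, and that "precision" means the number of digits after the decimal point. The function names and the sample records are hypothetical.

```python
from decimal import Decimal, InvalidOperation


def decimal_places(text):
    """Count the digits after the decimal point in a coordinate string."""
    try:
        exponent = Decimal(text).as_tuple().exponent
    except InvalidOperation:
        return 0
    # Special values such as NaN carry a non-integer exponent.
    return max(0, -exponent) if isinstance(exponent, int) else 0


def passes_qc(rec):
    """Return True if a record (dict of strings or None) meets the QC criteria."""
    lat = rec.get("decimallatitude")
    lon = rec.get("decimallongitude")
    uncertainty = rec.get("coordinateuncertaintyinmeters")
    year = rec.get("year")
    # Assumption: None or "" stands in for SQL NULL, and a comparison
    # against NULL excludes the record, as it would in a WHERE clause.
    if not (lat and lon and uncertainty and year):
        return False
    if rec.get("informationwithheld"):
        return False
    return (
        float(uncertainty) <= 250
        and float(lon) < 0
        and int(year) >= 2003
        and decimal_places(lat) >= 3
        and decimal_places(lon) >= 3
    )


# Hypothetical records, for illustration only.
records = [
    {"decimallatitude": "48.7771", "decimallongitude": "-121.8132",
     "coordinateuncertaintyinmeters": "30", "year": "2016",
     "informationwithheld": None},
    {"decimallatitude": "48.7", "decimallongitude": "-121.8",
     "coordinateuncertaintyinmeters": "30", "year": "2016",
     "informationwithheld": None},
]
kept = [r for r in records if passes_qc(r)]
print(len(kept), "of", len(records), "records pass QC")
```

Counting decimal places on the text as written in the download keeps trailing zeros (for example 48.120), which would be lost if the coordinate were first converted to a floating-point number.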

Figure 1. 2014 Whitebark Pine Range Map for U.S. and Canadian distributions by Smith and Collingwood (2014).

Following the QC review I wrote a database view (i.e., a saved query) that included a point geometry column created using the PostGIS extension and the decimallatitude and decimallongitude columns, and set the spatial reference identifier (SRID) to 4326 (WGS84). I then displayed the whitebark pine observation points in Esri ArcGIS Pro 2.9. I overlaid the points on the 2014 whitebark pine range map for U.S. and Canadian distributions by Smith and Collingwood (2014) to assess the degree of overlap and identify observations that occurred outside the 2014 mapped range (Figure 1). I performed a spatial join between the 2014 range map and the observation points and assigned each point in the database a category, either inside or outside the 2014 range map.
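The inside/outside assignment described above was done with a spatial join in GIS software. For readers who want to see the underlying idea, here is a minimal sketch of a point-in-polygon test using ray casting. It treats longitude and latitude as plane coordinates and handles a single polygon without holes, which is a simplification of a real range map. The polygon and the two points are hypothetical.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test; ring is a list of (lon, lat) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a ray running east from the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def classify(lon, lat, ring):
    """Label a point as 'inside' or 'outside' the polygon."""
    return "inside" if point_in_polygon(lon, lat, ring) else "outside"


# Hypothetical rectangular "range" polygon, not the 2014 range map.
example_ring = [(-115.0, 48.0), (-113.0, 48.0), (-113.0, 50.0), (-115.0, 50.0)]

print(classify(-114.0, 49.0, example_ring))
print(classify(-121.8, 48.8, example_ring))
```

A production workflow would normally rely on the spatial join tools in a GIS or spatial database, which also handle multipart polygons, holes, and points that fall exactly on a boundary.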

I calculated the shortest distance from each point to the edge of the 2014 range map and assigned each point in the database this distance attribute. Lastly, for the observations outside the 2014 range I applied 2 km and 4 km buffers to each point, calculated the area of each resulting buffer, and then calculated the total area of all buffers to determine a rough order-of-magnitude (ROM) estimate of the amount of additional whitebark pine range added by the iNaturalist and herbarium observations.
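The distance and buffer steps above can be sketched in plain Python as follows. This is an approximation for illustration, not the GIS procedure used in the study: it projects coordinates onto a local flat plane centered on the observation (an equirectangular approximation that is reasonable over tens of kilometers), then takes the shortest distance to any edge of a polygon. The polygon, the point, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def to_local_km(lon, lat, lon0, lat0):
    """Approximate east/north offsets in km from (lon0, lat0)."""
    x = math.radians(lon - lon0) * math.cos(math.radians(lat0)) * EARTH_RADIUS_KM
    y = math.radians(lat - lat0) * EARTH_RADIUS_KM
    return x, y


def dist_to_segment(px, py, ax, ay, bx, by):
    """Shortest plane distance from point P to segment AB."""
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        return math.hypot(px - ax, py - ay)
    # Position of the closest point along AB, clamped to the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_edge_km(lon, lat, ring):
    """Shortest distance in km from a point to the polygon boundary."""
    pts = [to_local_km(x, y, lon, lat) for x, y in ring]
    n = len(pts)
    return min(
        dist_to_segment(0.0, 0.0, *pts[i], *pts[(i + 1) % n]) for i in range(n)
    )


def buffer_area_km2(radius_km):
    """Area of a single circular buffer around one point."""
    return math.pi * radius_km ** 2


# Hypothetical polygon and observation, for illustration only.
example_ring = [(-115.0, 48.0), (-113.0, 48.0), (-113.0, 50.0), (-115.0, 50.0)]
print(round(distance_to_edge_km(-114.0, 47.0, example_ring), 1), "km to edge")
print(round(buffer_area_km2(2), 2), "km2 per 2 km buffer")
print(round(buffer_area_km2(4), 2), "km2 per 4 km buffer")
```

Note that where buffers around nearby points overlap, simply summing the individual areas counts the shared ground more than once; merging the buffers before measuring avoids this.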

I selected the Smith and Collingwood (2014) range map instead of the newer 2019 whitebark pine range by WPEF (2019) because the latter includes only the U.S. distribution, and I wanted to include both the U.S. and Canada in my analysis. Note that the analysis presented here is not intended as a critique of the Smith and Collingwood (2014) map. Their map was developed at a broad scale (range-wide, population level) and is not suitable for mapping and analysis at the stand level. Rather, the analysis is intended to illustrate the utility of the citizen-science and herbarium observations as supplemental data to support future whitebark range mapping efforts, identify areas that may need additional surveys to verify the extent of whitebark pine populations, and highlight the importance of these data sources for conservation science.

Results and Discussion

Of the total 1,486 observations included in the analysis, 1,276 (86%) were located within the 2014 range map, and 210 (14%) were located outside it (Figure 2). Thus, the large majority of the observations fell within the 2014 range map, a result that makes intuitive sense and also speaks to the high quality of that map. Of the 210 observations located outside the 2014 range map, 195 (93%) were made between 2014 and 2023, and 15 (7%) were made between 2003 and 2013. Thus, the vast majority of these observations are concurrent with or more recent than the 2014 range map.

Figure 2. GBIF whitebark pine observations and the 2014 range map from Smith and Collingwood (2014), western U.S. and Canada. The black boxes are the detailed view areas depicted in Figures 3 (north box) and 4 (south box).

Of the 210 observations located outside the 2014 range map, 58 (28%) were within 2 km (1.2 mi) of the existing range. The average distance was 9.7 ± 13.0 km (6.0 ± 8.1 mi), and the 50th, 10th, and 90th percentiles of distance were 4.8 km (3.0 mi), 0.4 km (0.3 mi), and 25.2 km (15.7 mi), respectively. The 50% of observations located <5 km from the 2014 range are likely explained by 1) the difference in spatial scales represented in the 2014 range map (range-wide, population scale) versus the observation data (stand scale), and 2) the minimum map unit size of the 2014 range map, i.e., populations too small in areal extent to map at the map scale. Researchers seeking to use the 2014 range map to prioritize conservation efforts or direct field surveys for whitebark pine may want to consider adding a 2–5 km buffer to the range map to capture whitebark stands directly adjacent to the known range. The remaining 50% of the observations were located >5 km from the 2014 range and may offer some data for directing future survey efforts and updated range maps.
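Summary statistics like those reported above (mean, spread, percentiles, and the share of points within 2 km) can be computed with Python's standard library. The sketch below assumes the ± value is a sample standard deviation, which the text does not state, and uses the "inclusive" percentile method, which may differ slightly from whatever method was used in the study. The distances shown are made up for illustration.

```python
import statistics


def summarize_distances(distances_km, near_km=2.0):
    """Summarize distances (km) from observations to the range edge."""
    # Nine cut points that split the data into ten equal-sized groups.
    deciles = statistics.quantiles(distances_km, n=10, method="inclusive")
    within = sum(d <= near_km for d in distances_km)
    return {
        "mean": statistics.mean(distances_km),
        "stdev": statistics.stdev(distances_km),  # sample standard deviation
        "p10": deciles[0],
        "p50": deciles[4],
        "p90": deciles[8],
        "share_within_near_km": within / len(distances_km),
    }


# Hypothetical distances in km, not the study data.
example_distances = [0.3, 0.9, 1.8, 3.5, 4.8, 6.2, 12.0, 25.0, 47.0]
summary = summarize_distances(example_distances)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

When the mean is much larger than the median, as in the results above, the distribution is skewed by a small number of distant observations, so the percentiles are usually the more informative summary.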

For example, four iNaturalist observations of whitebark pine made in 2016 in the vicinity of Mount Baker, WA are located approximately 47 km (29 mi) outside the 2014 range map (Figure 3). These stands may be too small to map at a broad scale; however, they represent a significant range extension to the west in the Cascade Mountains. Other notable examples of areas with several whitebark pine observations >5 km from the 2014 range are the Lake Tahoe, CA area (Figure 4) and the area south of Whistler, BC near Garibaldi Lake (not shown). These are just a few of the areas highlighted by this analysis.

The total area of whitebark pine range from the 2014 range map is 293,607 km² (113,362 mi²), an area roughly the size of the state of Arizona. To determine a ROM estimate of the amount of additional whitebark pine range added by the iNaturalist and herbarium observations I applied 2 km and 4 km buffers to each point. The total area added based on the buffers was between 1,389 and 4,869 km² (536 and 1,880 mi²), which equates to 0.5–1.7% of the area of the 2014 range map. While the proportional increase in estimated range is small, the absolute estimate is significant from a conservation standpoint, with the upper end (4,869 km²) equating to an area approximately the size of Glacier National Park.
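The proportions and unit conversions in the paragraph above can be checked with a few lines of arithmetic. The figures below are taken directly from the text; the function names are hypothetical, and the conversion factor is the square of 1.609344 km per mile.

```python
RANGE_2014_KM2 = 293_607          # area of the 2014 range map, from the text
ADDED_KM2 = {"2 km buffers": 1_389, "4 km buffers": 4_869}  # from the text
KM2_PER_MI2 = 1.609344 ** 2       # square kilometers in one square mile


def pct_of_range(added_km2, range_km2=RANGE_2014_KM2):
    """Added area as a percentage of the mapped range."""
    return 100 * added_km2 / range_km2


def km2_to_mi2(km2):
    """Convert square kilometers to square miles."""
    return km2 / KM2_PER_MI2


for label, area in ADDED_KM2.items():
    print(f"{label}: {area} km2 = {km2_to_mi2(area):.0f} mi2, "
          f"{pct_of_range(area):.1f}% of the 2014 range")
```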

Challenges and Limitations

Above I illustrated the importance of citizen-science and herbarium data for mapping and conservation of whitebark pine. The data are freely available and crowd-sourced, and with appropriate QC review can provide high quality field observations for analysis. However, caution should be exercised when using iNaturalist observations, even those that are research grade. For instance, the photos for one of the observations near Mount Baker discussed above are displayed in Figure 3. The photos clearly show the cones of whitebark pine, and I have high confidence in the identification. By contrast, the iNaturalist observation with the greatest distance (>100 km) from the 2014 range is located on Mount Harrison in the Albion Mountains of southern Idaho (Figure 1, black arrow). While the observation is research grade, there is only one photo, which shows the tree at a distance, with no needle, cone, or cone fragment photos. My confidence in this identification is lower given the great distance from the known range, the lack of needle and cone or cone fragment photos, and the fact that limber pine (Pinus flexilis) also occurs in this area and is superficially similar to whitebark pine. Therefore, I would not rely on this observation alone to extend the range further south in Idaho without additional surveys or iNaturalist records with complete photos (entire tree, branches, needles, cone or cone fragments).

Another challenge with using citizen-science data is that the observations are unstructured, meaning that they don’t follow a statistically rigorous study design, and they are often skewed spatially towards roads and trails (i.e., easy access areas). Lastly, research grade iNaturalist observations aren’t without misidentifications, and only further review by experts on iNaturalist can help find those errors and correct them. This is the reason why I started the iNaturalist projects: The Whitebark Pine Project and the Whitebark Pine Outside the Published Range Project, and why my colleague Michael Kaufmann started the 5-Needle Pines Along the Pacific Crest Trail iNaturalist project. I also provide guidelines for documenting whitebark pine observations on The Whitebark Pine Project website, and an identification guide to the 5-needle pines of western North America.

Next Steps & Call to Action

The purpose of this pilot study was to demonstrate the utility of citizen-science and herbarium data for biodiversity science and conservation. These data are an important supplement to resource inventories by government agencies and conservation groups. The next steps for this project are to obtain additional herbarium record data, for instance from the Consortium of Pacific Northwest Herbaria (this website was down at the time of writing, January 22, 2023). Using the additional herbarium data, and any new research grade iNaturalist records, I will repeat the analysis discussed above. The data will be submitted to the Whitebark Pine Ecosystem Foundation for use in future whitebark pine range map updates.

To end I’d like to encourage everyone interested in the ecology and conservation of whitebark pine to get involved with citizen-science and record your observations of whitebark through iNaturalist. Please use the identification guide and the guidelines for documenting whitebark pine to improve the quality of your iNaturalist records and to make accurate identifications. In addition to observation data we also need people to volunteer time to make identifications on iNaturalist. You can view your observations, provide identifications, and follow project blog posts at the iNaturalist projects mentioned above.

Literature Cited

Global Biodiversity Information Facility (GBIF). 2023. GBIF Occurrence Download. Downloaded on 2023-01-06.

Smith, C. and A. Collingwood. 2014. Whitebark Range Map for U.S. and Canadian Distributions. Parks Canada, Waterton Lakes National Park, Alberta. Available at: (Accessed 2023-01-20)

Whitebark Pine Ecosystem Foundation (WPEF). 2019. 2019 Whitebark Pine Distribution Map. Prepared by the Whitebark Pine Ecosystem Foundation, Missoula, Montana. Available at:

