### 4.2.1 Shapefiles, one shapefile for each class.
### Each shapefile should be named with the name of the desired output class, followed by "valid" (e.g., watervalid.shp).
### readShapePoly() comes from the maptools package and extract() from the raster package; both are assumed to be loaded, and img to be defined, earlier in the script.
### The pattern is a regular expression that matches file names ending in "valid.shp".
valid.files <- list.files(pattern="valid\\.shp$")
valid.full <- NULL
for (x in seq_along(valid.files)) {
  valid.locations <- readShapePoly(valid.files[x])
  valid.predictors <- extract(img, valid.locations)
  ### If the data are in a point shapefile instead of polygons, skip this next line.
  valid.predictors <- as.data.frame(do.call(rbind, valid.predictors))
  ### The class name is the file name minus its last 9 characters ("valid.shp").
  valid.by.class <- cbind(valid.predictors, "types" = substr(valid.files[x], 1, nchar(valid.files[x])-9))
  valid.full <- rbind(valid.full, valid.by.class)
}

### 4.2.2 One shapefile with multiple classes.
### This shapefile must have at least one field that identifies the classes (in the code below, the field is called "types").
### The shapefile can be of any feature type, but different code must be used for points versus any other type.
### The comments below indicate which code to use for points and which to use for other types.
valid.locations <- readShapePoly("shapefile")
valid.full <- NULL
valid.types <- unique(valid.locations[["types"]])
for (i in seq_along(valid.types)) {
  valid.type <- valid.types[i]
  valid.type.locations <- valid.locations[valid.locations[["types"]] == valid.type,]
  valid.by.class <- extract(img, valid.type.locations)
  ### If the data are in a point shapefile instead of polygons, skip this next line.
  valid.by.class <- as.data.frame(do.call("rbind", valid.by.class))
  valid.by.class <- cbind(valid.by.class, "types" = as.character(valid.type))
  valid.full <- rbind(valid.by.class, valid.full)
}

### 4.2.3 Csv files, one csv file for each class.
### Each csv file should be named with the name of the desired output class, followed by "valid" (e.g., forestvalid.csv),
### and should contain only the coordinates for the validation data, with columns headed x and y.
### The pattern is a regular expression that matches file names ending in "valid.csv".
valid.files <- list.files(pattern="valid\\.csv$")
valid.full <- NULL
for (x in seq_along(valid.files)) {
  valid.locations <- read.csv(valid.files[x])
  ### coordinates() is from the sp package; it converts the data frame to spatial points.
  coordinates(valid.locations) <- c("x","y")
  valid.predictors <- as.data.frame(raster::extract(img, valid.locations))
  ### The class name is the file name minus its last 9 characters ("valid.csv").
  valid.by.class <- cbind(valid.predictors, "types" = substr(valid.files[x], 1, nchar(valid.files[x])-9))
  valid.full <- rbind(valid.full, valid.by.class)
}

### 4.2.4 Csv file with accuracy assessment data already extracted and assembled.
### Data assembled outside of R that are to be used with the code in this article should have a column for the data from each band
### and a column headed "types" containing class names for each observation.
### Do not change this code to use the word "class" as the variable name for your classes, because "class" is the name of a built-in R function.
valid.full <- read.csv("my-data-file.csv")

### 4.2.5 Random extraction of accuracy assessment data from imported training data.
### This is a common practice. The statistical appropriateness of this approach is not addressed in this article,
### although analysts should be aware of issues related to the use of non-independent data for validating results.
### Get the number of entries (rows) in the data.
x <- nrow(train.full)
### Create index numbers for extracting the training data. The example is set for 75% training, but this can be changed.
train.index <- sample(seq_len(x), floor(x * 0.75))
### Create index numbers for the rest for validation.
valid.index <- setdiff(seq_len(x), train.index)
### Extract the training data to a temporary object.
train.temp <- train.full[train.index,]
### Extract the validation data.
valid.full <- train.full[valid.index,]
### Rename the training data to match the rest of the code.
train.full <- train.temp
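
### Optional check of the assembled validation data (a minimal sketch, not part of the original workflow).
### Whichever option in 4.2 was used, valid.full is expected to hold one column per band plus a "types" column.
### The lines below assume that train.full was assembled the same way, so that both objects should share column names;
### they stop with an error if that assumption does not hold.
stopifnot(is.data.frame(valid.full))
stopifnot("types" %in% names(valid.full))
stopifnot(setequal(names(valid.full), names(train.full)))
### Show the number of validation observations in each class, to spot classes with few or no samples.
table(valid.full$types)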