detectree2.models package
Submodules
detectree2.models.evaluation module
Evaluate model performance.
Classes and functions to evaluate model performance.
- class detectree2.models.evaluation.Feature(filename, directory, number, feature, lidar_filename, lidar_img, EPSG)
Bases:
object
Feature class to store crown features (coordinates, area, height) for evaluation.
- get_tuple_coords(coords)
Converts coordinates’ data structure from a list of lists to a list of tuples.
- poly_area()
Calculates the area of the feature from scaled geojson.
- tree_height()
Crops the lidar tif to the features and calculates height.
Calculates the median height to account for error at the boundaries. If no lidar file is provided, the height is given as 0.
- class detectree2.models.evaluation.GeoFeature(filename, directory, number, feature, lidar_img, EPSG)
Bases:
object
Feature class to store crown features (coordinates, area, height) for evaluation.
- get_tuple_coords(coords)
Converts coordinates’ data structure from a list of lists to a list of tuples.
- poly_area()
Calculates the area of the feature from scaled geojson.
- tree_height()
Crops the lidar tif to the features and calculates height.
Calculates the median height to account for error at the boundaries. If no lidar file is provided, the height is given as 0.
- detectree2.models.evaluation.f1_cal(precision, recall)
Calculate the F1 score.
- detectree2.models.evaluation.feat_threshold_tests(feature_instance, conf_threshold, area_threshold, border_filter, tile_width)
Tests to determine whether a feature should be considered valid.
Checks whether the feature is above the confidence threshold if a confidence score is available (this only applies to predicted crowns). Filters out features with very small areas, which are often crowns from an adjacent tile that have spilled over slightly. Removes features that fall within a border of the tile edge; the border size is given by border_filter as a proportion of the tile width.
- detectree2.models.evaluation.feat_threshold_tests2(feature_instance, conf_threshold, area_threshold, border_filter, tile_width, tile_origin)
Tests to determine whether a feature should be considered valid.
Checks whether the feature is above the confidence threshold if a confidence score is available (this only applies to predicted crowns). Filters out features with very small areas, which are often crowns from an adjacent tile that have spilled over slightly. Removes features that fall within a border of the tile edge; the border size is given by border_filter as a proportion of the tile width.
- detectree2.models.evaluation.feats_height_filt(all_feats, min_height, max_height)
Stores the numbers of all the features between min and max height.
- detectree2.models.evaluation.find_intersections(all_test_feats, all_pred_feats)
Finds the greatest intersection between predicted and manual crowns and then updates objects.
- detectree2.models.evaluation.get_epsg(file)
Split up the file name to get the EPSG code.
- detectree2.models.evaluation.get_heights(all_feats, min_height, max_height)
Find the heights of the trees.
- detectree2.models.evaluation.get_tile_origin(file)
Split up the file name to get the tile origin.
- detectree2.models.evaluation.get_tile_width(file)
Split up the file name to get the width and buffer, then add them to get the overall width.
- detectree2.models.evaluation.initialise_feats(directory, file, lidar_filename, lidar_img, area_threshold, conf_threshold, border_filter, tile_width, EPSG)
Creates a list of all the features as objects of the class.
- detectree2.models.evaluation.initialise_feats2(directory, file, lidar_img, area_threshold, conf_threshold, border_filter, tile_width, tile_origin, epsg)
Creates a list of all the features as objects of the class.
- detectree2.models.evaluation.positives_test(all_test_feats, all_pred_feats, min_IoU, min_height, max_height)
Determines the number of true positives, false positives and false negatives.
Stores the numbers of all test features for which a true positive arises.
- detectree2.models.evaluation.prec_recall(total_tps: int, total_fps: int, total_fns: int)
Calculate the precision and recall by standard formulas.
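For reference, a minimal sketch of the standard formulas that prec_recall and f1_cal compute; the helper names and the zero-denominator handling below are illustrative and may differ from the package's implementation:
```python
def precision_recall(total_tps: int, total_fps: int, total_fns: int):
    """Precision and recall from true/false positive and false negative counts."""
    precision = total_tps / (total_tps + total_fps) if (total_tps + total_fps) else 0.0
    recall = total_tps / (total_tps + total_fns) if (total_tps + total_fns) else 0.0
    return precision, recall

def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

p, r = precision_recall(total_tps=80, total_fps=20, total_fns=40)
print(round(p, 3), round(r, 3), round(f1(p, r), 3))  # 0.8 0.667 0.727
```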
- detectree2.models.evaluation.save_feats(tile_directory, all_feats)
Collates all the information for the features back into a geojson and saves it.
- detectree2.models.evaluation.site_f1_score(tile_directory=None, test_directory=None, pred_directory=None, lidar_img=None, IoU_threshold=0.5, height_threshold=0, area_fraction_limit=0.0005, conf_threshold=0, border_filter=<class 'tuple'>, scaling=<class 'list'>, EPSG=None, save=False)
Calculates all the intersections of shapes in a pair of files and the areas of the corresponding polygons.
- Args:
tile_directory: path to the folder containing all of the tiles
test_directory: path to the folder containing just the test files
pred_directory: path to the folder containing the predictions and the reprojections
lidar_img: path to the lidar image of an entire region
IoU_threshold: minimum value of IoU such that the intersection can be considered a true positive
height_threshold: minimum height of the features to be considered
area_fraction_limit: proportion of the tile area; crowns with areas smaller than this fraction will be ignored
conf_threshold: minimum confidence of a predicted feature for it to be considered
border_filter: bool of whether to remove border crowns, proportion of border to be used in relation to tile size
scaling: x and y scaling used when tiling the image
EPSG: area code of tree location
save: bool to tell program whether the filtered crowns should be saved
- detectree2.models.evaluation.site_f1_score2(tile_directory=None, test_directory=None, pred_directory=None, lidar_img=None, IoU_threshold=0.5, min_height=0, max_height=100, area_threshold=25, conf_threshold=0, border_filter=<class 'tuple'>, save=False)
Calculates all the intersections of shapes in a pair of files and the areas of the corresponding polygons.
- Args:
tile_directory: path to the folder containing all of the tiles
test_directory: path to the folder containing just the test files
pred_directory: path to the folder containing the predictions and the reprojections
lidar_img: path to the lidar image of an entire region
IoU_threshold: minimum value of IoU such that the intersection can be considered a true positive
min_height: minimum height of the features to be considered
max_height: maximum height of the features to be considered
area_threshold: minimum crown area to consider in m^2
conf_threshold: minimum confidence of a predicted feature for it to be considered
border_filter: bool to remove border crowns, buffer in from the border to be used (in m) in relation to tile size
scaling: x and y scaling used when tiling the image
save: bool to tell program whether the filtered crowns should be saved
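A usage sketch for site_f1_score2; the paths below are placeholders and assume the folder layout produced by the tiling and prediction steps:
```python
from detectree2.models.evaluation import site_f1_score2

site_f1_score2(
    tile_directory="./tiles/",                 # placeholder paths
    test_directory="./tiles/test/",
    pred_directory="./tiles/predictions_geo/",
    lidar_img=None,        # without a LiDAR raster, heights default to 0
    IoU_threshold=0.5,     # minimum IoU for a true positive
    min_height=0,
    max_height=100,
    area_threshold=25,     # minimum crown area in m^2
    conf_threshold=0.2,    # discard low-confidence predictions
    save=False,
)
```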
detectree2.models.models module
detectree2.models.outputs module
Process and clean predictions.
Functions to process model predictions into outputs for model evaluation and mapping crowns in geographic space.
- detectree2.models.outputs.average_polygons(polygons, weights=None, num_points=300)
Average a set of polygons.
- detectree2.models.outputs.box_filter(filename, shift: int = 0)
Create a bounding box from a file name to filter edge crowns.
- Args:
filename: Name of the file.
shift: Number of meters to shift the size of the bounding box in by. This is to avoid edge crowns.
- Returns:
gpd.GeoDataFrame: A GeoDataFrame containing the bounding box.
- detectree2.models.outputs.box_make(minx: int, miny: int, width: int, buffer: int, crs, shift: int = 0)
Generate bounding box from geographic specifications.
- Args:
minx: Minimum x coordinate.
miny: Minimum y coordinate.
width: Width of the tile.
buffer: Buffer around the tile.
crs: Coordinate reference system.
shift: Number of meters to shift the size of the bounding box in by. This is to avoid edge crowns.
- Returns:
gpd.GeoDataFrame: A GeoDataFrame containing the bounding box.
- detectree2.models.outputs.calc_iou(shape1, shape2)
Calculate the IoU of two shapes.
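The IoU is the standard intersection-over-union of two geometries. A minimal shapely sketch of that definition (the helper name is illustrative and not necessarily the exact implementation used here):
```python
from shapely.geometry import Polygon

def iou(shape1, shape2) -> float:
    """Intersection over union of two shapely geometries."""
    union = shape1.union(shape2).area
    return shape1.intersection(shape2).area / union if union else 0.0

a = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
b = Polygon([(1, 1), (3, 1), (3, 3), (1, 3)])
print(iou(a, b))  # 1 / 7 ≈ 0.143
```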
- detectree2.models.outputs.clean_crowns(crowns: GeoDataFrame, iou_threshold: float = 0.7, confidence: float = 0.2, area_threshold: float = 2, field: str = 'Confidence_score') GeoDataFrame
Clean overlapping crowns.
Outputs can contain highly overlapping crowns including in the buffer region. This function removes crowns with a high degree of overlap with others but a lower Confidence Score.
- Args:
crowns (gpd.GeoDataFrame): Crowns to be cleaned.
iou_threshold (float, optional): IoU threshold that determines whether crowns are overlapping.
confidence (float, optional): Minimum confidence score for crowns to be retained. Defaults to 0.2. Note that this should be adjusted to fit “field”.
area_threshold (float, optional): Minimum area of crowns to be retained. Defaults to 1m2 (assuming UTM).
field (str): Field used to prioritise selection of crowns. Defaults to “Confidence_score” but this should be changed to “Area” if using a model that outputs area.
- Returns:
gpd.GeoDataFrame: Cleaned crowns.
- detectree2.models.outputs.clean_outputs(crowns: GeoDataFrame, iou_threshold=0.7)
Clean predictions prior to accuracy assessment.
Outputs can contain highly overlapping crowns including in the buffer region. This function removes crowns with a high degree of overlap with others but a lower Confidence Score.
- detectree2.models.outputs.clean_predictions(directory, iou_threshold=0.7)
Clean predictions prior to accuracy assessment.
- detectree2.models.outputs.combine_and_average_polygons(gdfs, iou=0.9)
Combine and average polygons.
- detectree2.models.outputs.filename_geoinfo(filename)
Return geographic info of a tile from its filename.
- detectree2.models.outputs.load_geopandas_dataframes(folder)
Load all GeoPackage files in a folder into a list of GeoDataFrames.
- detectree2.models.outputs.normalize_polygon(polygon, num_points)
Normalize a polygon to a set number of points.
- detectree2.models.outputs.polygon_from_mask(masked_arr)
Convert RLE data from the output instances into Polygons.
Leads to a small amount of data loss but does not appear to affect performance. Adapted from https://github.com/hazirbas/coco-json-converter/blob/master/generate_coco_json.py
- detectree2.models.outputs.post_clean(unclean_df: GeoDataFrame, clean_df: GeoDataFrame, iou_threshold: float = 0.3, field: str = 'Confidence_score') GeoDataFrame
Fill in the gaps left by clean_crowns.
- Args:
unclean_df (gpd.GeoDataFrame): Unclean crowns.
clean_df (gpd.GeoDataFrame): Clean crowns.
iou_threshold (float, optional): IoU threshold that determines whether predictions are considered overlapping. Defaults to 0.3.
- detectree2.models.outputs.project_to_geojson(tiles_path, pred_fold=None, output_fold=None, multi_class: bool = False)
Projects json predictions back in geographic space.
Takes a json and changes it to a geojson so it can be overlaid with the orthomosaic. Another copy is produced to overlay with PNGs.
- Args:
tiles_path (str): Path to the tiles folder.
pred_fold (str): Path to the predictions folder.
output_fold (str): Path to the output folder.
- Returns:
None
- detectree2.models.outputs.stitch_crowns(folder: str, shift: int = 1)
Stitch together predicted crowns.
- Args:
folder: Path to folder containing geojson files.
shift: Number of meters to shift the size of the bounding box in by. This is to avoid edge crowns.
- Returns:
gpd.GeoDataFrame: A GeoDataFrame containing all the crowns.
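A sketch of a typical post-processing pipeline with the functions in this module, assuming per-tile JSON predictions already exist; all paths and threshold values are placeholders:
```python
from detectree2.models.outputs import project_to_geojson, stitch_crowns, clean_crowns

tiles_path = "./tiles/"                    # output of the tiling step
pred_fold = "./tiles/predictions/"         # per-tile JSON predictions
output_fold = "./tiles/predictions_geo/"   # destination for the projected GeoJSONs

# Project per-tile JSON predictions back into geographic space.
project_to_geojson(tiles_path, pred_fold=pred_fold, output_fold=output_fold)

# Stitch the per-tile crowns into a single GeoDataFrame, trimming edge crowns.
crowns = stitch_crowns(output_fold, shift=1)

# Remove highly overlapping, low-confidence crowns, then save.
crowns = clean_crowns(crowns, iou_threshold=0.6, confidence=0.2)
crowns.to_file("./crowns_out.gpkg")
```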
- detectree2.models.outputs.to_eval_geojson(directory=None)
Converts predicted jsons to a geojson for evaluation (not mapping!).
Reprojects the crowns to overlay with the cropped crowns and cropped PNGs. Another copy is produced to overlay with the PNGs.
detectree2.models.predict module
Generate predictions.
This module contains the code to generate predictions on tiled data.
- detectree2.models.predict.predict_on_data(directory: str = './', out_folder: str = 'predictions', predictor=<class 'detectron2.engine.defaults.DefaultPredictor'>, eval=False, save: bool = True, num_predictions=0)
Make predictions on tiled data.
Predicts crowns for all PNG images present in a directory and outputs masks as JSONs.
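A usage sketch, assuming cfg is a detectron2 CfgNode (e.g. built with detectree2.models.train.setup_cfg) whose weights point at a trained checkpoint:
```python
from detectron2.engine import DefaultPredictor
from detectree2.models.predict import predict_on_data

# cfg is assumed to come from detectree2.models.train.setup_cfg with
# update_model pointing at a trained checkpoint.
predictor = DefaultPredictor(cfg)

# Predict crowns for every PNG tile under ./tiles/ and save the masks
# as JSON files in the out_folder ("predictions" by default).
predict_on_data("./tiles/", predictor=predictor, save=True)
```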
detectree2.models.test module
detectree2.models.train module
Train a model.
Classes and functions to train a model based on orthomosaics and corresponding manual crown data.
- class detectree2.models.train.FlexibleDatasetMapper(cfg, is_train=True, augmentations=None)
Bases:
DatasetMapper
A flexible dataset mapper that extends the standard DatasetMapper to handle multi-band images and custom augmentations.
This class is designed to work with datasets that may contain images with more than three channels (e.g., multispectral images) and allows for custom augmentations to be applied. It also handles semantic segmentation data if provided in the dataset.
- Args:
cfg (CfgNode): Configuration object containing dataset and model configurations.
is_train (bool): Flag indicating whether the mapper is being used for training. Default is True.
augmentations (list, optional): List of augmentations to be applied. Default is an empty list.
- Attributes:
cfg (CfgNode): Stores the configuration object for later use.
is_train (bool): Indicates whether the mapper is in training mode.
logger (Logger): Logger instance for logging messages.
- class detectree2.models.train.LossEvalHook(eval_period, model, data_loader, patience)
Bases:
HookBase
A custom hook for evaluating loss during training and managing model checkpoints based on evaluation metrics.
This hook is designed to:
- Perform inference on a dataset similarly to an Evaluator.
- Calculate and log the loss metric during training.
- Save the best model checkpoint based on a specified evaluation metric (e.g., AP50).
- Implement early stopping if the evaluation metric does not improve over a specified number of evaluations.
- Attributes:
_model: The model to evaluate.
_period: Number of iterations between evaluations.
_data_loader: The data loader used for evaluation.
patience: Number of evaluation periods to wait before early stopping.
iter: Tracks the number of evaluations since the last improvement in the evaluation metric.
max_ap: The best evaluation metric (e.g., AP50) achieved during training.
best_iter: The iteration at which the best evaluation metric was achieved.
- after_step()
Hook to be called after each training iteration to evaluate the model and manage checkpoints.
Evaluates the model at regular intervals.
Saves the best model checkpoint based on the AP50 metric.
Implements early stopping if the AP50 does not improve after a set number of evaluations.
- after_train()
Hook to be called after training is complete to load the best model checkpoint based on AP50.
Selects and loads the model checkpoint with the best AP50.
- class detectree2.models.train.MyTrainer(cfg, patience)
Bases:
DefaultTrainer
Custom Trainer class that extends the DefaultTrainer.
This trainer adds flexibility for handling different image types (e.g., RGB and multi-band images) and custom training behavior, such as early stopping and specialized data augmentation strategies.
- Args:
cfg (CfgNode): Configuration object containing the model and dataset configurations.
patience (int): Number of evaluation periods to wait for improvement before early stopping.
- classmethod build_evaluator(cfg, dataset_name, output_folder=None)
Build the evaluator for the model.
- Args:
cfg (CfgNode): Configuration object.
dataset_name (str): Name of the dataset to evaluate.
output_folder (str, optional): Directory to save evaluation results. Defaults to “eval”.
- Returns:
COCOEvaluator: An evaluator for COCO-style datasets.
- build_hooks()
Build the training hooks, including the custom LossEvalHook.
This method adds a custom hook for evaluating the model’s loss during training, with support for early stopping based on the AP50 metric.
- Returns:
list: A list of hooks to be used during training.
- classmethod build_test_loader(cfg, dataset_name)
Build the test data loader.
This method configures the data loader for evaluation, using the FlexibleDatasetMapper to handle custom augmentations and image types.
- Args:
cfg (CfgNode): Configuration object.
dataset_name (str): Name of the dataset to load for testing.
- Returns:
DataLoader: A data loader for the test dataset.
- classmethod build_train_loader(cfg)
Build the training data loader with support for custom augmentations and image types.
This method configures the data loader to apply specific augmentations depending on the image mode (RGB or multi-band) and resize strategy defined in the configuration.
- Args:
cfg (CfgNode): Configuration object.
- Returns:
DataLoader: A data loader for the training dataset.
- train()
Run the training loop.
This method overrides the DefaultTrainer’s train method to include early stopping and custom logging of Average Precision (AP) metrics.
- Returns:
OrderedDict: Results from evaluation, if evaluation is enabled. Otherwise, None.
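A minimal training sketch with MyTrainer, assuming cfg has been built with setup_cfg (see below) and the datasets it references have been registered:
```python
from detectree2.models.train import MyTrainer

# cfg is assumed to be a CfgNode produced by setup_cfg (see below).
trainer = MyTrainer(cfg, patience=5)  # stop after 5 evaluations without AP50 improvement
trainer.resume_or_load(resume=False)  # load the initial weights specified in cfg
trainer.train()
```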
- detectree2.models.train.combine_dicts(root_dir: str, val_dir: int, mode: str = 'train', class_mapping: Dict[str, int] | None = None) List[Dict[str, Any]]
Combine dictionaries from different directories based on the specified mode.
This function aggregates tree dictionaries from multiple directories within a root directory. Depending on the mode, it either combines dictionaries from all directories, all except a specified validation directory, or only from the validation directory.
- Args:
root_dir (str): The root directory containing subdirectories with tree dictionaries.
val_dir (int): The index (1-based) of the validation directory to exclude or use depending on the mode.
mode (str, optional): The mode of operation. Can be “train”, “val”, or “full”. “train” excludes the validation directory, “val” includes only the validation directory, and “full” includes all directories. Defaults to “train”.
class_mapping: A dictionary mapping class labels to category indices (optional).
- Returns:
List of combined dictionaries from the specified directories.
- detectree2.models.train.get_classes(out_dir)
Read the classes that were recorded during tiling.
- Args:
out_dir: directory where classes.txt is located
- Returns:
list of classes
- detectree2.models.train.get_filenames(directory: str)
Get the file names if no geojson is present.
Allows for predictions where no delineations have been manually produced.
- Args:
directory (str): directory of images to be predicted on
- detectree2.models.train.get_latest_model_path(output_dir: str) str
Find the model file with the highest index in the specified output directory.
- Args:
output_dir (str): The directory where the model files are stored.
- Returns:
str: The path to the model file with the highest index.
- detectree2.models.train.get_tree_dicts(directory: str, class_mapping: Dict[str, int] | None = None) List[Dict[str, Any]]
Get the tree dictionaries.
- Args:
directory: Path to the directory containing the tree dictionaries.
class_mapping: A dictionary mapping class labels to category indices (optional).
- Returns:
List of dictionaries corresponding to segmentations of trees. Each dictionary includes bounding box around tree and points tracing a polygon around a tree.
- detectree2.models.train.load_json_arr(json_path)
Load json array.
- Args:
json_path: path to json file
- detectree2.models.train.modify_conv1_weights(model, num_input_channels)
Modify the weights of the first convolutional layer (conv1) to accommodate a different number of input channels.
This function adjusts the weights of the conv1 layer in the model’s backbone to support a custom number of input channels. It creates a new weight tensor with the desired number of input channels, and initializes it by repeating the weights of the original channels.
- Args:
model (torch.nn.Module): The model containing the convolutional layer to modify.
num_input_channels (int): The number of input channels for the new conv1 layer.
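A minimal sketch of the weight-repetition idea described above, shown on a standalone convolution rather than the model's actual backbone (the helper name is illustrative, and layer access inside a real detectron2 model may differ):
```python
import torch
import torch.nn as nn

def expand_conv_weights(conv: nn.Conv2d, num_input_channels: int) -> nn.Conv2d:
    """Return a new conv layer whose weights repeat the original input channels."""
    out_ch, in_ch, kh, kw = conv.weight.shape
    new_conv = nn.Conv2d(num_input_channels, out_ch, kernel_size=(kh, kw),
                         stride=conv.stride, padding=conv.padding,
                         bias=conv.bias is not None)
    with torch.no_grad():
        # Tile the original weights along the input-channel axis, then crop.
        repeats = -(-num_input_channels // in_ch)  # ceiling division
        new_conv.weight.copy_(conv.weight.repeat(1, repeats, 1, 1)[:, :num_input_channels])
    return new_conv

rgb_conv = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
ms_conv = expand_conv_weights(rgb_conv, num_input_channels=5)  # e.g. 5-band imagery
```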
- detectree2.models.train.predictions_on_data(directory=None, predictor=<class 'detectron2.engine.defaults.DefaultTrainer'>, trees_metadata=None, save=True, scale=1, geos_exist=True, num_predictions=0)
Produces predictions from a test folder and outputs them to the predictions folder.
- Args:
directory: directory containing test data
predictor: predictor object
trees_metadata: metadata for trees
save: boolean to save predictions
scale: scale of image
geos_exist: boolean to determine if geojson files exist
num_predictions: number of predictions to make
- detectree2.models.train.register_test_data(test_location, name='tree')
Register data for testing.
- Args:
test_location: directory containing test data
name: string to name data
- detectree2.models.train.register_train_data(train_location, name: str = 'tree', val_fold=None, class_mapping_file=None)
Register data for training and (optionally) validation.
- Args:
train_location: Directory containing training folds.
name: Name to register the dataset.
val_fold: Validation fold index (optional).
class_mapping_file: Path to the class mapping file (json or pickle).
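A registration sketch; the site name “Paracou” and the folder paths are placeholders, and the assumption that the registered datasets follow the name_train / name_val pattern is based on the default dataset names in setup_cfg (“trees_train” / “trees_val”):
```python
from detectree2.models.train import register_train_data, register_test_data

train_location = "./tiles/train/"   # placeholder: folder containing the training folds
register_train_data(train_location, "Paracou", val_fold=5)
# Assumed to register "Paracou_train" and "Paracou_val" in the detectron2 DatasetCatalog.

test_location = "./tiles/test/"     # placeholder: folder containing held-out test data
register_test_data(test_location, "Paracou")
```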
- detectree2.models.train.remove_registered_data(name='tree')
Remove registered data from catalog.
- Args:
name: string of named registered data
- detectree2.models.train.setup_cfg(base_model: str = 'COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x.yaml', trains=('trees_train',), tests=('trees_val',), update_model=None, workers=2, ims_per_batch=2, gamma=0.1, backbone_freeze=3, warm_iter=120, momentum=0.9, batch_size_per_im=1024, base_lr=0.0003389, weight_decay=0.001, max_iter=1000, eval_period=100, out_dir='./train_outputs', resize='fixed', imgmode='rgb', num_bands=3, class_mapping_file=None)
Set up config object.
- Args:
base_model: base pre-trained model from detectron2 model_zoo
trains: names of registered data to use for training
tests: names of registered data to use for evaluating models
update_model: updated pre-trained model from detectree2 model_garden
workers: number of workers for dataloader
ims_per_batch: number of images per batch
gamma: gamma for learning rate scheduler
backbone_freeze: backbone layer to freeze
warm_iter: number of iterations for warmup
momentum: momentum for optimizer
batch_size_per_im: batch size per image
base_lr: base learning rate
weight_decay: weight decay for optimizer
max_iter: maximum number of iterations
num_classes: number of classes
eval_period: number of iterations between evaluations
out_dir: directory to save outputs
resize: resize strategy for images
imgmode: image mode (rgb or multispectral)
num_bands: number of bands in the image
class_mapping_file: path to class mapping file
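A configuration sketch showing a few commonly adjusted parameters; the dataset names are placeholders and must match datasets registered with register_train_data:
```python
from detectree2.models.train import setup_cfg

# Dataset names are placeholders; they must already be registered
# (see register_train_data above).
cfg = setup_cfg(
    base_model="COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x.yaml",
    trains=("Paracou_train",),
    tests=("Paracou_val",),
    update_model=None,        # optionally a detectree2 model_garden checkpoint path
    max_iter=3000,
    eval_period=100,
    out_dir="./train_outputs",
    resize="fixed",
    imgmode="rgb",
    num_bands=3,
)
```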