deepblink.metrics module

Functions to calculate training loss on a single image.

deepblink.metrics.compute_metrics(pred: numpy.ndarray, true: numpy.ndarray, mdist: float = 3.0) → pandas.core.frame.DataFrame[source]

Calculate metric scores across cutoffs.

Parameters:
  • pred – Predicted set of coordinates.
  • true – Ground truth set of coordinates.
  • mdist – Maximum euclidean distance in px up to which F1 scores will be calculated.
Returns:

DataFrame with one row per cutoff, containing the following columns:

  • f1_score: Harmonic mean of precision and recall based on the number of coordinates
    found at different distance cutoffs (around ground truth).
  • abs_euclidean: Average euclidean distance at each cutoff.
  • offset: List of (r, c) coordinates denoting offset in pixels.
  • f1_integral: Area under curve f1_score vs. cutoffs.
  • mean_euclidean: Normalized average euclidean distance based on the total number of assignments.
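
A minimal usage sketch (the coordinate values below are made up; both arrays are assumed to be in (r, c) format of shape (n, 2), as documented for f1_integral below):

    import numpy as np
    from deepblink.metrics import compute_metrics

    # Hypothetical predicted and ground truth coordinates in (r, c) format.
    pred = np.array([[10.2, 20.1], [30.0, 30.5], [55.0, 60.0]])
    true = np.array([[10.0, 20.0], [30.0, 31.0]])

    df = compute_metrics(pred=pred, true=true, mdist=3.0)
    print(df.head())  # columns as listed above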

deepblink.metrics.euclidean_dist(x1: float, y1: float, x2: float, y2: float) → float[source]

Return the euclidean distance between the two points (x1, y1) and (x2, y2).
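
For example, assuming the standard definition, the distance between (0, 0) and (3, 4) is 5:

    from deepblink.metrics import euclidean_dist

    euclidean_dist(x1=0.0, y1=0.0, x2=3.0, y2=4.0)  # 5.0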

deepblink.metrics.f1_integral(pred: numpy.ndarray, true: numpy.ndarray, mdist: float = 3.0, n_cutoffs: int = 50, return_raw: bool = False) → Union[float, tuple][source]

F1 integral calculation / area under F1 vs. cutoff.

Compute the area under the curve when plotting F1 score vs. cutoff values. The optimal score is ~1 (up to floating point inaccuracy), reached when an F1 score of 1 is achieved across all cutoff values, including 0.

Parameters:
  • pred – Array of shape (n, 2) for predicted coordinates.
  • true – Array of shape (n, 2) for ground truth coordinates.
  • mdist – Maximum cutoff distance to calculate F1. Defaults to 3.0.
  • n_cutoffs – Number of intermediate cutoff steps. Defaults to 50.
  • return_raw – If True, returns f1_scores, offsets, and cutoffs. Defaults to False.
Returns:

By default returns a single value, the f1_integral score. If return_raw is True, returns a tuple containing:

  • f1_scores: The non-integrated list of F1 values for all cutoffs.
  • offsets: Offsets in (r, c) of predicted coordinates assigned to true coordinates.
  • cutoffs: A list of all cutoffs used.
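
A short usage sketch (coordinates invented for illustration):

    import numpy as np
    from deepblink.metrics import f1_integral

    pred = np.array([[12.0, 15.5], [40.0, 8.0]])  # hypothetical predictions (r, c)
    true = np.array([[12.5, 15.0], [40.0, 9.0]])  # hypothetical ground truth (r, c)

    score = f1_integral(pred, true, mdist=3.0, n_cutoffs=50)
    f1_scores, offsets, cutoffs = f1_integral(pred, true, return_raw=True)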

Notes

scipy.spatial.distance.cdist(xa, xb), with xa of shape (n_a, 2) and xb of shape (n_b, 2), returns a matrix of shape (n_a, n_b). Here we use pred as xa and true as xb, which means that the matrix has all pred coordinates along the row axis and all true coordinates along the column axis; its transpose has the opposite. The linear assignment takes in a cost matrix and returns the coordinates of assigned costs that fall below a defined cutoff. This assignment takes the rows as reference and assigns columns to them. Using the transposed matrix, which yields row and column coordinates named “true_pred_r” and “true_pred_c” respectively, therefore makes true (along the matrix row axis) the reference and pred (along the matrix column axis) the assignments, in other words the assigned predictions that are close to ground truth coordinates.

To now calculate the offsets, we can use the “true_pred” rows and columns to find the originally referenced coordinates. As mentioned, the transposed matrix has true along its row axis and pred along its column axis, so basic indexing suffices; the [0] and [1] indices refer to a coordinate’s row and column value. This offset is used twice: once to plot the scatter pattern to make sure models aren’t biased in one direction, and once to compute the euclidean distance.

The euclidean distance cannot simply be summed up like the F1 score, because the different cutoffs actively influence the maximum possible euclidean distance. Instead, we sum up all distances measured across every cutoff and then divide by the total number of assigned coordinates. This automatically weights models with more detections at lower cutoffs.
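
The bookkeeping described in the notes is easier to follow in code. The following sketch is an illustration only (variable names and the sign convention of the offsets are chosen here, not taken from the source); it assumes the linear_sum_assignment wrapper documented below:

    import numpy as np
    from scipy.spatial.distance import cdist
    from deepblink.metrics import linear_sum_assignment

    pred = np.array([[12.0, 15.5], [40.0, 8.0]])  # hypothetical predictions (r, c)
    true = np.array([[12.5, 15.0], [40.0, 9.0]])  # hypothetical ground truth (r, c)
    cutoff = 1.5

    # cdist(pred, true): pred along the row axis, true along the column axis.
    matrix = cdist(pred, true)

    # Assign on the transpose so that true (rows) is the reference and
    # pred (columns) is assigned to it.
    true_pred_r, true_pred_c = linear_sum_assignment(matrix.T, cutoff)

    # Offsets in (row, column) between each assigned pair; indexing with
    # [0] and [1] picks a coordinate's row and column value.
    offsets = [
        (true[r][0] - pred[c][0], true[r][1] - pred[c][1])
        for r, c in zip(true_pred_r, true_pred_c)
    ]

    # The mean euclidean distance pools such distances across every cutoff and
    # divides by the total number of assignments (see the paragraph above).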

deepblink.metrics.f1_score(pred: numpy.ndarray, true: numpy.ndarray) → Optional[float][source]

F1 score metric.

\[F1 = \frac{2 \cdot precision \cdot recall}{precision + recall}\]

The harmonic mean of precision and recall, weighting both equally. The best value is 1 and the worst value is 0.

NOTE – direction dependent, arguments can't be switched!

Parameters:
  • pred – np.ndarray of shape (n, n, 3): p, r, c format for each cell.
  • true – np.ndarray of shape (n, n, 3): p, r, c format for each cell.
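
As a quick numeric illustration of the formula above (the counts below are made up and are plugged into the formula directly, not passed to the function):

    # Hypothetical counts of true positives, false positives, and false negatives.
    tp, fp, fn = 8, 2, 4

    precision = tp / (tp + fp)                           # 0.8
    recall = tp / (tp + fn)                              # ~0.667
    f1 = 2 * precision * recall / (precision + recall)   # ~0.727
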
deepblink.metrics.linear_sum_assignment(matrix: numpy.ndarray, cutoff: float = None) → Tuple[list, list][source]

Solve the linear sum assignment problem with a cutoff.

A problem instance is described by a cost matrix where each entry matrix[i, j] is the cost of matching row i (worker) with column j (job). The goal is to find the optimal assignment of j to i, provided the cost is below the cutoff.

Parameters:
  • matrix – Matrix containing cost/distance to assign cols to rows.
  • cutoff – Maximum cost/distance value assignments can have.
Returns:

(rows, columns) corresponding to the matching assignment.
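
A usage sketch with a hypothetical 3x3 cost matrix; given the documented cutoff behaviour, the third pairing should be dropped because its cost exceeds the cutoff:

    import numpy as np
    from deepblink.metrics import linear_sum_assignment

    # Hypothetical cost matrix: rows are references, columns are candidates.
    cost = np.array([
        [0.5, 9.0, 9.0],
        [9.0, 0.2, 9.0],
        [9.0, 9.0, 7.5],
    ])

    rows, cols = linear_sum_assignment(cost, cutoff=3.0)
    # Expected given the documented behaviour: rows 0 and 1 are paired with
    # columns 0 and 1; the 7.5 cost exceeds the cutoff and is excluded.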

deepblink.metrics.offset_euclidean(offset: List[tuple]) → numpy.ndarray[source]

Calculate the euclidean distance per coordinate based on its (row, column) offset.
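
A short usage sketch with made-up offsets; the expected output assumes each distance is simply the norm of its (r, c) offset:

    from deepblink.metrics import offset_euclidean

    offset = [(1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]  # hypothetical (r, c) offsets
    offset_euclidean(offset)                       # expected: array([1., 2., 5.])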

deepblink.metrics.precision_score(pred: numpy.ndarray, true: numpy.ndarray) → float[source]

Precision score metric.

Defined as tp / (tp + fp) where tp is the number of true positives and fp the number of false positives. Can be interpreted as the ability not to mislabel negative samples, or how many selected items are relevant. The best value is 1 and the worst value is 0.

NOTE – direction dependent, arguments can't be switched!

Parameters:
  • pred – np.ndarray of shape (n, n, 3): p, r, c format for each cell.
  • true – np.ndarray of shape (n, n, 3): p, r, c format for each cell.
deepblink.metrics.recall_score(pred: numpy.ndarray, true: numpy.ndarray) → float[source]

Recall score metric.

Defined as tp / (tp + fn) where tp is the number of true positives and fn the number of false negatives. Can be interpreted as the ability to find all positive samples, or how many relevant samples were selected. The best value is 1 and the worst value is 0.

NOTE – direction dependent, arguments can't be switched!

Parameters:
  • pred – np.ndarray of shape (n, n, 3): p, r, c format for each cell.
  • true – np.ndarray of shape (n, n, 3): p, r, c format for each cell.
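
A sketch of how such tp/fp/fn counts could be derived from the (n, n, 3) grids. Treating the first channel as a binary detection flag is an assumption made here for illustration, not necessarily how the library computes the scores:

    import numpy as np

    # Hypothetical 4x4 grids in p, r, c format; only the (assumed binary)
    # p channel at index 0 is used for counting below.
    pred = np.zeros((4, 4, 3))
    true = np.zeros((4, 4, 3))
    pred[1, 2, 0] = 1
    pred[3, 3, 0] = 1
    true[1, 2, 0] = 1
    true[0, 0, 0] = 1

    tp = np.sum((pred[..., 0] == 1) & (true[..., 0] == 1))  # 1
    fp = np.sum((pred[..., 0] == 1) & (true[..., 0] == 0))  # 1
    fn = np.sum((pred[..., 0] == 0) & (true[..., 0] == 1))  # 1

    precision = tp / (tp + fp)  # 0.5
    recall = tp / (tp + fn)     # 0.5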