The Fishing Detection Models
We have refined and replaced our models several times over the course of the project. Each model is summarized below:
- Heuristic Model 1.0
- Our first model was a heuristic model based on the intuition of Figure 2 below.
- Logistic Regression Model 1.1
- A logistic regression model using the same features as the heuristic model, trained on a hand-labeled dataset.
- Neural Net Model 1.0
- We are currently developing a convolutional neural net (CNN) model using the same training data as the logistic model.
Background to fishing detection
The Global Fishing Watch (GFW) fishing score model computes the probability that a vessel is fishing based on its AIS track data. The combined fishing score of all vessels is used to estimate the fishing activity worldwide.
Definition of Fishing
The definition of fishing is the subject of a surprising amount of debate at GFW. We define fishing as the period when a vessel has fishing gear in the water. However, we also use a more expansive definition, fishing-related activity, which is the time a vessel spends away from shore when it is not transiting to or from the fishing grounds. For trawlers and longliners, the two definitions give similar results. For purse seiners they can differ substantially, because the time these vessels spend with gear in the water is small relative to the time spent pursuing fish.
An example of an AIS vessel track is shown in Figure 1 below, with the points where the vessel is fishing shown in red. The job of the fishing score model is to estimate the probability that a vessel is fishing at each point along the AIS track. The fishing score at a given point is computed from three primary features: the vessel’s average speed, the amount of variation in the vessel’s speed, and the amount of variation in the vessel’s course. The “variation” in speed and course is technically the standard deviation, and we shall refer to these features as the speed deviation and course deviation. In addition, points within 10 km of shore are uniformly considered non-fishing. Ordinary vessel behavior near shore is easily confused with fishing, and this 10 km cutoff avoids a large number of false positives.
Figure 1: Example fishing vessel AIS track with fishing shown in red
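The 10 km shore cutoff described above can be sketched as a simple post-processing step. This is only an illustration, not the GFW implementation: the function name `apply_shore_cutoff`, the per-point `distances_to_shore_km` input, and the choice to treat a point at exactly 10 km as "within" the cutoff are all assumptions.

```python
SHORE_CUTOFF_KM = 10.0  # cutoff distance stated in the text


def apply_shore_cutoff(scores, distances_to_shore_km, cutoff_km=SHORE_CUTOFF_KM):
    """Force the fishing score to 0.0 for points within the shore cutoff.

    `scores` and `distances_to_shore_km` are parallel lists with one entry
    per AIS point (a hypothetical input layout). A point at exactly the
    cutoff distance is treated as near shore, which is an assumption.
    """
    return [
        0.0 if distance <= cutoff_km else score
        for score, distance in zip(scores, distances_to_shore_km)
    ]


# Example: the first and last points are near shore and are zeroed out.
masked = apply_shore_cutoff([0.9, 0.8, 0.1], [2.0, 50.0, 10.0])
```

Applying the cutoff after scoring keeps the model itself independent of coastline data, though the cutoff could equally be applied before scoring.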
To see how these three features separate fishing from non-fishing, examine the scatter plot shown in Figure 2. It shows how the features, computed over a six-hour window, relate to whether a vessel is fishing. It is apparent from Figure 2 that fishing activity tends to be clustered in certain regions of average speed, speed deviation, and course deviation.
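As a rough sketch of how the three features might be computed for each point of a track, the example below takes the mean speed, the standard deviation of speed, and the standard deviation of course over a six-hour window. Several details are assumptions rather than the project's actual code: the function name `window_features`, the `(timestamp_s, speed_knots, course_deg)` tuple layout, a window centered on each point, and the use of a plain population standard deviation that ignores course wrap-around at 0/360 degrees.

```python
from statistics import mean, pstdev

WINDOW_SECONDS = 6 * 3600  # six-hour window, as described in the text


def window_features(track, window_seconds=WINDOW_SECONDS):
    """Return (avg_speed, speed_dev, course_dev) for each point of a track.

    `track` is a list of (timestamp_s, speed_knots, course_deg) tuples.
    The window is centered on each point, which is an assumption; the
    text only states that the window is six hours long.
    """
    half = window_seconds / 2
    features = []
    for t, _, _ in track:
        # Collect every point whose timestamp falls inside the window.
        # The point itself always qualifies, so the window is never empty.
        window = [(s, c) for (ts, s, c) in track if abs(ts - t) <= half]
        speeds = [s for s, _ in window]
        courses = [c for _, c in window]
        # Plain standard deviation; course wrap-around is not handled here.
        features.append((mean(speeds), pstdev(speeds), pstdev(courses)))
    return features


# Example: a vessel steaming steadily has zero speed and course deviation.
steady = window_features([(0, 10.0, 90.0), (3600, 10.0, 90.0)])
```

The linear scan per point keeps the sketch short; a real pipeline over long tracks would likely use a sliding window instead.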