Context
Seals produce a variety of vocalizations for communication, often in noisy marine environments that also contain other marine animals, vessel noise, and environmental interference. Unlike the consistent tonals of split-beam or depth sonar, seal calls vary widely in frequency, duration, and structure. They range from short calls (around 0.2 seconds) to extended multi-part calls lasting up to 10 seconds, and typically occupy a frequency band between 400 Hz and 2500 Hz. This variability, combined with natural and anthropogenic noise, makes seal calls difficult to classify.
In this challenge, you are tasked with designing an algorithm to detect, track, and classify seal calls in underwater recordings. The data you will work with are passive acoustic recordings from hydrophones placed in areas known to host seal populations. These recordings include natural environmental sounds, potential interference from vessels, and other marine noise sources.
We anticipate that spectrogram images will serve as a primary feature representation for model inputs, but participants are also encouraged to explore the use of raw audio data as additional inputs to enhance the models’ performance and robustness.
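As a starting point, the spectrogram representation mentioned above can be sketched as follows. This is a minimal example, not a prescribed pipeline: the function name `band_limited_spectrogram` and the window settings (`nperseg`, `noverlap`) are assumptions chosen for illustration, and the 400 Hz to 2500 Hz limits are taken from the typical call band described in the Context section.

```python
import numpy as np
from scipy import signal


def band_limited_spectrogram(audio, fs, f_lo=400.0, f_hi=2500.0,
                             nperseg=1024, noverlap=768):
    """Return (freqs, times, s_db) restricted to the band [f_lo, f_hi] Hz.

    The window length and overlap are illustrative assumptions and should
    be tuned to the sample rate of the recordings.
    """
    freqs, times, sxx = signal.spectrogram(
        audio, fs=fs, window="hann", nperseg=nperseg,
        noverlap=noverlap, scaling="density", mode="psd")
    band = (freqs >= f_lo) & (freqs <= f_hi)
    # Convert power to decibels; the small offset avoids log(0).
    s_db = 10.0 * np.log10(sxx[band] + 1e-12)
    return freqs[band], times, s_db
```

The returned array has shape (frequency bins, time frames) and can be passed to a detector or rendered as an image for a CNN. Raw audio remains available as a parallel input if you want to compare the two representations.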
Challenge Goals
- Detection and Tracking of Seal Calls:
- Your initial task is to develop a detection algorithm for individual seal calls within a time-frequency representation of the recordings (spectrogram). The challenge here is to reliably isolate seal vocalizations from background noise and potential overlapping sources.
- Starting with a single seal call in simple noise conditions, develop a tracking algorithm that can adapt as complexity increases, such as in scenarios where multiple calls overlap or vessel noise interferes.
- You are encouraged to explore both classic signal processing techniques and machine learning-based approaches (e.g., CNNs) to optimize your tracker for seal vocalizations.
- Classification of Call Types:
- Seal calls can vary significantly. You will be provided with a catalogue of typical seal call spectrograms, categorized by common call types and approximate frequency bands. Your goal is to classify detected calls based on these categories.
- A classifier based on frequency and temporal features will be essential. Consider incorporating feature extraction methods (e.g., short-time Fourier transform, power spectrum, correlation) to distinguish subtle call characteristics.
- As noise increases in complexity, adapt your classifier to account for changes in call duration, frequency range, and multi-part structures.
- Visualization for Acoustic Operators:
- Design a visualization of your detection, tracking, and classification results that could aid a bioacoustic analyst or sonar operator. Display key metrics (e.g., intensity, frequency range, duration) and classification confidence for each detected call. Consider using heatmaps, spectrogram overlays, or other visual cues to highlight potential seal calls.
- Provide options for analysts to review and verify classified calls and, where machine learning is used, for the system to learn from their corrections over time.
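For the detection goal, a simple energy-based baseline is one possible place to start before moving to learned detectors. The sketch below assumes a band-limited spectrogram in decibels with shape (frequency bins, time frames) and evenly spaced frame times. The function name `detect_calls`, the threshold factor `k`, and the minimum duration `min_dur` are hypothetical choices, and the robust threshold assumes that calls occupy well under half of the frames.

```python
import numpy as np


def detect_calls(s_db, times, k=4.0, min_dur=0.2):
    """Return a list of (start_s, end_s) intervals with elevated band energy.

    s_db  : 2-D array (freq bins, time frames), spectrogram in dB
    times : 1-D array of frame times in seconds, evenly spaced
    """
    energy = s_db.mean(axis=0)                      # mean level per frame
    med = np.median(energy)
    mad = np.median(np.abs(energy - med)) + 1e-9    # robust spread estimate
    active = energy > med + k * 1.4826 * mad

    # Pad with False so every active run has a rising and a falling edge.
    padded = np.concatenate(([False], active, [False]))
    edges = np.flatnonzero(np.diff(padded.astype(int)))
    starts, ends = edges[0::2], edges[1::2] - 1     # inclusive frame indices

    dt = times[1] - times[0] if len(times) > 1 else 0.0
    events = []
    for i0, i1 in zip(starts, ends):
        t0, t1 = times[i0], times[i1] + dt
        if t1 - t0 >= min_dur:                      # drop very short blips
            events.append((float(t0), float(t1)))
    return events
```

Each returned interval can then be cropped from the spectrogram and handed to the classifier, and the same intervals can be drawn as overlays in the operator visualization.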
Increasing Complexity Over Time
- Stage 1: Detect and classify a single call in a low-noise environment.
- Stage 2: Introduce multiple overlapping seal calls and natural interference, such as wave noise.
- Stage 3: Add anthropogenic noise, including distant vessel sounds, and test the algorithm’s robustness against various interference types.
- Stage 4: Detect and track calls whose frequency content changes over time, requiring algorithms to adapt to variations in call type and duration.
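While developing, it can help to rehearse the stages above on synthetic clips with known ground truth before moving to the provided recordings. The sketch below is only a stand-in: it uses a windowed linear chirp as a crude placeholder for a call, white noise as the background, and an optional steady tone as a very rough proxy for vessel noise. None of these resemble real seal calls or real vessels, and the function name `synth_clip` and all default values are assumptions. Calls must fit inside the clip, and at least one call is required.

```python
import numpy as np
from scipy.signal import chirp


def synth_clip(fs=8000, dur=5.0, calls=((1.0, 0.5, 600.0, 1200.0),),
               snr_db=10.0, vessel_hz=None, seed=0):
    """Return (audio, truth) for a synthetic clip.

    calls : tuples of (start_s, length_s, f_start_hz, f_end_hz)
    truth : list of (start_s, end_s) for each synthetic call
    """
    rng = np.random.default_rng(seed)
    n = int(fs * dur)
    clean = np.zeros(n)
    truth = []
    for start, length, f0, f1 in calls:
        m = int(fs * length)
        tt = np.arange(m) / fs
        sweep = chirp(tt, f0=f0, t1=length, f1=f1, method="linear")
        i0 = int(fs * start)
        clean[i0:i0 + m] += sweep * np.hanning(m)   # tapered onset and offset
        truth.append((start, start + length))

    # Scale white noise so the call-to-noise power ratio equals snr_db.
    sig_pow = np.mean(clean[clean != 0] ** 2)
    noise = rng.standard_normal(n) * np.sqrt(sig_pow / 10 ** (snr_db / 10))
    audio = clean + noise
    if vessel_hz is not None:
        # Steady tone as a crude stand-in for anthropogenic interference.
        audio += 0.5 * np.sin(2 * np.pi * vessel_hz * np.arange(n) / fs)
    return audio, truth
```

A single call at high `snr_db` approximates Stage 1, several overlapping entries in `calls` approximate Stage 2, and setting `vessel_hz` adds a simple Stage 3 style interferer.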
Core Challenge Focus Areas
To tailor the challenge to their expertise, participants may focus on one or more of the following:
- Tracking Algorithm Optimization: Improve detection and tracking algorithms to handle increasingly complex scenarios, such as multiple overlapping calls and varying background noise levels.
- Frequency Analysis for Classification: Apply advanced signal processing or machine learning to analyse the frequency content of detected calls, improving classification accuracy across different call types.
- Noise Mitigation Techniques: Develop noise suppression methods that effectively isolate seal calls from background and interference noise, leveraging techniques like per-channel energy normalization or PCA.
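To make the per-channel energy normalization (PCEN) option concrete, here is a minimal NumPy/SciPy sketch. It assumes a non-negative energy spectrogram of shape (frequency bins, time frames), not a dB spectrogram. The function name `pcen` and the default parameter values are illustrative starting points only and would need tuning for hydrophone data.

```python
import numpy as np
from scipy.signal import lfilter


def pcen(energy, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Apply PCEN to a non-negative (freq, time) energy spectrogram.

    Each frequency channel is divided by a smoothed version of its own
    recent energy, which suppresses slowly varying background noise,
    and is then compressed with a root function.
    """
    energy = np.asarray(energy, dtype=float)
    # First-order IIR smoother along time:
    #   M[t] = (1 - s) * M[t-1] + s * E[t]
    # The initial state is set so that M[0] equals E[0].
    zi = (1.0 - s) * energy[:, :1]
    smooth, _ = lfilter([s], [1.0, s - 1.0], energy, axis=1, zi=zi)
    # Adaptive gain control followed by dynamic range compression.
    return (energy / (eps + smooth) ** alpha + delta) ** r - delta ** r
```

Stationary background in a channel is flattened to a near-constant level, while a sudden onset such as a call stands out because the smoothed estimate lags behind it.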
Suggested Technical Approaches
- Feature Extraction: Use signal processing tools like STFT, wavelet transforms, or cepstral analysis to capture the unique characteristics of seal calls, which often vary in duration and frequency modulation.
- Tracking Models: Consider methods such as particle filters, Kalman filters, or graph-based tracking to maintain track continuity, even in cases of call overlap or noise masking.
- Classification Models: Experiment with simple classifiers (e.g., random forest, SVM) for initial stages, and CNNs or other deep learning architectures for more nuanced classification, particularly in multi-call environments.
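As one illustration of the tracking models listed above, the sketch below applies a constant-velocity Kalman filter to per-frame peak-frequency measurements. It assumes a single call at a time, with `NaN` marking frames where the peak could not be measured (for example, under noise masking). The function name `kalman_track`, the process noise `q`, the measurement noise `r`, and the prior covariance are all hypothetical tuning values.

```python
import numpy as np


def kalman_track(meas_hz, dt, q=2000.0, r=30.0):
    """Track a call's frequency contour with a constant-velocity model.

    meas_hz : 1-D array of peak frequencies in Hz, NaN where masked
    dt      : frame spacing in seconds
    q       : assumed std of sweep-rate change (Hz/s^2)
    r       : assumed std of the frequency measurement (Hz)
    """
    meas_hz = np.asarray(meas_hz, dtype=float)
    F = np.array([[1.0, dt], [0.0, 1.0]])           # state: [freq, sweep rate]
    H = np.array([[1.0, 0.0]])                      # only frequency is observed
    Q = q ** 2 * np.array([[dt ** 4 / 4, dt ** 3 / 2],
                           [dt ** 3 / 2, dt ** 2]])
    R = np.array([[r ** 2]])

    first = meas_hz[np.isfinite(meas_hz)][0]        # initialise at first valid value
    x = np.array([first, 0.0])
    P = np.diag([r ** 2, 1e7])                      # wide prior on sweep rate

    track = np.empty(len(meas_hz))
    for k, z in enumerate(meas_hz):
        # Predict one frame ahead.
        x = F @ x
        P = F @ P @ F.T + Q
        if np.isfinite(z):
            # Update with the measurement; masked frames coast on the prediction.
            y = z - H @ x
            S = H @ P @ H.T + R
            K = P @ H.T / S
            x = x + (K * y).ravel()
            P = (np.eye(2) - K @ H) @ P
        track[k] = x[0]
    return track
```

Because masked frames are bridged by prediction alone, the track stays continuous through short gaps. Handling several overlapping calls would additionally require data association, for which particle filters or graph-based methods may be better suited.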
Pre-Requisites
Participants should have a background in signal processing and familiarity with MATLAB, Python, or similar programming environments. Libraries such as NumPy, SciPy, Matplotlib, and machine learning frameworks like TensorFlow or PyTorch will be helpful for this challenge.
Evaluation Metrics
- Detection and Classification Accuracy: Measure performance in correctly identifying, tracking, and classifying seal calls, compared to ground-truth data.
- Adaptability: Test algorithm resilience as complexity increases, particularly in cases with high interference or multiple overlapping calls.
- Visualization Quality: Assess the clarity and usability of the visualization for operators, ensuring that the output is intuitive and informative.
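The exact scoring procedure is not specified here, so the following is only one plausible way to compare detections against ground truth. It matches detected and annotated call intervals by temporal overlap (intersection over union) and reports precision, recall, and F1. The function names and the `min_iou` threshold of 0.5 are assumptions, not the official metric.

```python
def interval_iou(a, b):
    """Intersection over union of two (start_s, end_s) intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def score_detections(detected, truth, min_iou=0.5):
    """Greedy one-to-one matching; returns (precision, recall, f1)."""
    unmatched = list(truth)
    tp = 0
    for det in detected:
        # Pick the best-overlapping annotation that is still unmatched.
        best = max(unmatched, key=lambda gt: interval_iou(det, gt),
                   default=None)
        if best is not None and interval_iou(det, best) >= min_iou:
            unmatched.remove(best)
            tp += 1
    precision = tp / len(detected) if detected else 0.0
    recall = tp / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Running the same scoring at each complexity stage gives a simple view of adaptability: a robust algorithm should show a gradual decline in F1 from stage to stage, without a sharp drop. Classification accuracy can be computed separately over the matched pairs.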