Week 2: Building the Foundation for Active Learning in DeepForest

Nakshatra

This week, I started working on the Active Learning integration part. The main goal is to enable efficient selection of images for annotation, making DeepForest more intelligent and scalable without locking it into a specific annotation platform.

Active Learning: A Basic Introduction

Active learning is a machine learning approach where the model chooses the data from which it learns. Rather than labeling everything, we pick the most informative samples, the ones that, if labeled, would yield the greatest model improvement. This helps reduce labeling effort significantly while maintaining model performance.

In our case, this means selecting which images to annotate to improve the model in the fewest iterations.
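The selection step described above can be sketched in a few lines of plain Python. This is only a conceptual skeleton under my own naming (`active_learning_round`, `score_fn` and the toy scores are all hypothetical, not part of DeepForest's API):

```python
def active_learning_round(unlabeled_pool, score_fn, n):
    """One selection round: score every image, return the n most informative.

    score_fn maps an image to an "informativeness" score; higher means
    labeling that image would help the model more. Both names are
    placeholders for illustration.
    """
    ranked = sorted(unlabeled_pool, key=score_fn, reverse=True)
    return ranked[:n]

# Toy scores standing in for model-derived informativeness.
scores = {"img_a.tif": 0.1, "img_b.tif": 0.9, "img_c.tif": 0.4}
picked = active_learning_round(scores, scores.get, 2)
print(picked)  # ['img_b.tif', 'img_c.tif']
```

In a real round, `score_fn` would come from model predictions (e.g. detection confidence), and the picked images would be sent off for annotation before retraining.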

The Workflow

We planned the active learning pipeline in four major steps:

  1. Connect a pool of unlabeled imagery to the model and apply a selection strategy.
  2. Serve the selected images to an annotation platform (we're prototyping with Label Studio, though DeepForest won’t formally depend on any specific tool).
  3. (Optional) Integrate model predictions to assist in annotation.
    • 3a. Pre-annotate using DeepForest predictions (similar to BOEM’s setup).
    • 3b. Use Label Studio’s model backend through Docker (for future advanced use).
  4. Retrieve annotations via API using Label Studio’s interface and keys.
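For Step 4, Label Studio exposes completed annotations through its REST export endpoint, which can be reached with just the standard library. The endpoint path and `Token` header below follow Label Studio's documented API as I understand it; the base URL, project id, and key are placeholders:

```python
import json
import urllib.request

def build_export_request(base_url, project_id, api_key):
    """Build a request for Label Studio's project export endpoint.

    Label Studio serves completed annotations at
    GET /api/projects/<id>/export?exportType=JSON, authenticated with an
    "Authorization: Token <key>" header. base_url and api_key are placeholders.
    """
    url = f"{base_url}/api/projects/{project_id}/export?exportType=JSON"
    return urllib.request.Request(url, headers={"Authorization": f"Token {api_key}"})

def fetch_annotations(base_url, project_id, api_key):
    # Returns the exported tasks (with their annotations) as a list of dicts.
    req = build_export_request(base_url, project_id, api_key)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

req = build_export_request("http://localhost:8080", 42, "YOUR_API_KEY")
print(req.full_url)  # http://localhost:8080/api/projects/42/export?exportType=JSON
```

The project id and API key would come from the Label Studio instance itself (the same `ls_project_id` the CLI accepts).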

While these steps define the broader vision, I initially jumped ahead and focused heavily on the backend integration (Step 3), which cost me a lot of time because it was simply beyond my knowledge of Label Studio at that point. I mistakenly assumed that users would submit a list of already-selected images to annotate, which is why I skipped the first two steps. In reality, users provide a pool of unlabeled images, and the system should help them choose which ones to annotate based on active learning strategies.

This realization led to a change in my thinking and approach.

Work Done This Week

  • I developed a CLI tool that takes a pool of unlabeled images, the desired number of images to annotate, and a selection strategy, and returns a ranked list of images to prioritize for annotation. It produces two outputs: a .txt file listing the selected images and a .csv file containing the pre-annotations. This is a very basic implementation, used only to check whether the pipeline works end to end; I will refine it later.
    # Example usage of the CLI:

    python CLI.py `
      --image_folder .\Vulture_03_28_2022 `
      --patch_size 512 `
      --patch_overlap 0.1 `
      --min_score 0.1 `
      --min_detection_score 0.6 `
      --confident_threshold 0.5 `
      --strategy most-detections `
      --n 2 `
      --batch_size 16 `
      --pool_limit 5 `
      --ls_project_id 42 `
      --output_images random_detected_images.txt `
      --output_csv random_preannotations.csv
  • The current image selection logic is quite basic; images are ranked with a few simple strategies:

    • random selections
    • images with most-detections
    • images with target labels
    • images with rarest predictions

      This has laid the foundation for future integration of real active learning techniques like uncertainty sampling and diversity-based sampling.

  • I gained hands-on familiarity with Label Studio (I accidentally did this part earlier than planned), including:

    • Using Label Studio myself to learn its workflow.
    • Connecting a pool of unlabeled data to the selection pipeline.
    • Implementing the SAM model example.
    • Creating a basic ML backend of my own.

Even though backend integration is not the immediate priority, this exploration helped me understand how Label Studio works under the hood, and that knowledge will pay off in the coming weeks.
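The simple selection strategies listed earlier can each be expressed as a one-line ranking rule. This sketch runs on a hypothetical per-image prediction structure (image path to a list of predicted labels), not on DeepForest's actual prediction DataFrame:

```python
import random
from collections import Counter

# Hypothetical per-image predictions: image path -> list of predicted labels.
predictions = {
    "img_1.tif": ["Tree", "Tree", "Tree", "Bird"],
    "img_2.tif": ["Tree"],
    "img_3.tif": ["Bird", "Bird"],
}

def select(strategy, preds, n, target_label=None, seed=0):
    """Rank images under one of the basic strategies and return the top n."""
    if strategy == "random":
        return random.Random(seed).sample(sorted(preds), n)
    if strategy == "most-detections":
        # Images with the most predicted boxes first.
        return sorted(preds, key=lambda k: len(preds[k]), reverse=True)[:n]
    if strategy == "target-labels":
        # Images containing the most instances of a label of interest first.
        return sorted(preds, key=lambda k: preds[k].count(target_label), reverse=True)[:n]
    if strategy == "rarest":
        # Images containing globally rare labels first.
        freq = Counter(label for labels in preds.values() for label in labels)
        rarity = lambda k: min((freq[l] for l in preds[k]), default=float("inf"))
        return sorted(preds, key=rarity)[:n]
    raise ValueError(f"unknown strategy: {strategy}")

print(select("most-detections", predictions, 2))  # ['img_1.tif', 'img_3.tif']
```

Each rule is deliberately a heuristic on raw predictions; the proper active learning strategies planned for later would replace the ranking key, not the surrounding machinery.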

Challenges and Learnings

  • Wrong understanding of the workflow: I originally misunderstood the user flow. Fixing this helped restructure the entire approach and reaffirmed the importance of spending more time on Steps 1–2 before backend concerns.
  • Label Studio Integration: While not immediately needed, exploring its backend and preannotation workflows (like those in the BOEM repo) gave me a good understanding of what will be needed later.
  • Paper reading: I also read some papers related to advanced annotation strategies using SAM + Grounding DINO (used in detection and counting tasks). Though not directly applicable yet, they open up future possibilities for better annotation localization.

Next Steps

  • Improve image selection methods with proper active learning strategies like uncertainty and diversity sampling.
  • Refactor the CLI to properly handle the full flow from pool input to image output.
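As a preview of the uncertainty sampling mentioned above, one common formulation scores each detection by the entropy of its class probabilities and averages per image. The probability vectors below are invented for illustration, and the data layout is my own, not DeepForest's:

```python
import math

def entropy(probs):
    """Shannon entropy of a class-probability vector; higher = more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def rank_by_uncertainty(image_probs, n):
    """image_probs: image path -> list of per-detection probability vectors
    (hypothetical structure). Returns the n images with highest mean entropy."""
    def mean_entropy(vectors):
        if not vectors:
            return float("inf")  # no detections: treat as maximally uncertain
        return sum(entropy(v) for v in vectors) / len(vectors)
    return sorted(image_probs, key=lambda k: mean_entropy(image_probs[k]), reverse=True)[:n]

probs = {
    "img_a.tif": [[0.98, 0.02]],                # confident detection
    "img_b.tif": [[0.55, 0.45]],                # very uncertain detection
    "img_c.tif": [[0.80, 0.20], [0.60, 0.40]],  # mixed
}
print(rank_by_uncertainty(probs, 1))  # ['img_b.tif']
```

Diversity-based sampling would then be layered on top, penalizing selected images that are too similar to each other rather than scoring each one independently.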