Building a Computer Vision App
This guide outlines the end-to-end process for planning, building, and deploying a computer vision application with Plainsight Filters.
1. Creating the Specification
- **Define the Filter's Purpose**
  - State clearly what your filter should do, e.g. detect vehicles, classify products, track specific objects, or apply transformations.
  - Align the filter's functionality with business requirements (counting objects, measuring distances, extracting text, etc.).
- **Develop Subject Data (the "Spreadsheet Exercise")**
  - Identify the structured outputs your computer vision solution must produce.
  - For example, do you need bounding boxes, classification labels, or numeric measurements?
  - Think of it as columns in a spreadsheet: each row represents a frame or event; each column is a piece of subject data.
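The spreadsheet exercise can be sketched in code. Here is a minimal, illustrative Python row type for subject data; the field names are assumptions for this example, not a fixed Plainsight schema.

```python
from dataclasses import dataclass, asdict

@dataclass
class SubjectDataRow:
    """One spreadsheet row: a single detection in a single frame."""
    frame_id: int       # which frame or event this row describes
    label: str          # classification label, e.g. "truck"
    x1: float           # bounding-box corners -- each one a spreadsheet column
    y1: float
    x2: float
    y2: float
    confidence: float   # model confidence for this detection

row = SubjectDataRow(frame_id=42, label="truck",
                     x1=10.0, y1=20.0, x2=110.0, y2=80.0, confidence=0.91)
print(asdict(row)["label"])  # → truck
```

Writing the row type down early forces the team to agree on exactly which columns the pipeline must fill in before any model is trained.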
- **Feasibility and Early Experiments**
  - Run quick experiments to see whether existing models or computer vision techniques can meet your accuracy and speed requirements.
  - Check whether GPU- or CPU-based inference is needed, and confirm data availability and quality.
- **Write the Specification**
  - Summarize the filter's objective, required inputs/outputs, and performance constraints.
  - Identify which filters already exist in the Plainsight library (Utility Filters, Connectors, etc.) and which new Application Filters or Models must be developed.
  - Note any new models that need to be trained or specialized.
2. Model Training
- **Collect and Curate Data**
  - Use Data Collection Recipes (e.g., a Data Acquisition filter) to gather images and videos and store them in a cloud bucket.
  - Ingest this data into Encord or another labeling platform.
- **Create Labeling Guidelines & Ontologies**
  - Decide how your data will be annotated (bounding boxes, polygons, classification labels).
  - Clearly define classes, attributes, and annotation best practices so that labelers follow a consistent approach.
- **Train Models Using Protege**
  - Configure a training job in Protege, pointing to your labeled dataset in Encord.
  - Tune hyperparameters, run experiments, and evaluate performance metrics (precision, recall, etc.).
- **Publish Models**
  - Once satisfied, publish the model to an artifact registry (e.g., Google Artifact Registry, JFrog).
  - The published model can then be integrated into filters via the standard packaging workflow.
3. Develop Filter(s)
- **Integrate the Model**
  - Create a new filter project (or update an existing Application Filter).
  - Reference the published model artifact (e.g., using `jfrog://` or `gcr://` URIs).
  - Load the model in your filter's `setup()` method.
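A rough sketch of the setup step, assuming a `setup(config)` hook that runs once before frames arrive. The `VehicleDetectorFilter` class, config keys, and loader are stand-ins for illustration, not the actual Plainsight SDK API.

```python
# Hypothetical filter skeleton: load the published model once in setup()
# so per-frame processing stays fast. Class and config shape are assumed.
class VehicleDetectorFilter:
    def setup(self, config: dict) -> None:
        # config["model_uri"] would point at the published artifact,
        # e.g. a jfrog:// or gcr:// URI resolved by your packaging workflow.
        self.model_uri = config["model_uri"]
        self.model = self._load_model(self.model_uri)

    def _load_model(self, uri: str):
        # Placeholder: a real filter would download and deserialize the
        # model weights referenced by the artifact URI.
        return lambda frame: []  # stub model returning no detections

f = VehicleDetectorFilter()
f.setup({"model_uri": "gcr://example/vehicle-detector:1.0"})
print(f.model_uri)  # → gcr://example/vehicle-detector:1.0
```

Loading the model in `setup()` rather than per frame avoids paying the deserialization cost on every inference call.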
- **Build Computer Vision Logic**
  - Add code to perform inference on each frame.
  - Post-process results to produce subject data (e.g., bounding boxes, counts, heatmaps).
  - Convert inference output into the structured data defined in your specification.
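The post-processing step might look something like the following sketch, which assumes raw model output arrives as `(label, score, box)` tuples; the field names and confidence threshold are illustrative choices, not a fixed Plainsight format.

```python
# Convert raw inference output into the structured subject data rows
# defined in the specification, dropping low-confidence detections.
def to_subject_data(frame_id, detections, min_score=0.5):
    rows = []
    for label, score, (x1, y1, x2, y2) in detections:
        if score < min_score:
            continue  # filter out low-confidence detections
        rows.append({
            "frame_id": frame_id, "label": label, "confidence": score,
            "x1": x1, "y1": y1, "x2": x2, "y2": y2,
        })
    return rows

rows = to_subject_data(7, [("car", 0.92, (10, 10, 50, 40)),
                           ("car", 0.30, (0, 0, 5, 5))])
print(len(rows))  # → 1 (the 0.30-confidence detection is dropped)
```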
- **Test Filter Using Al Haytham**
  - Run benchmark tests and functional tests to validate the filter's output.
  - Compare predicted outputs against ground truth or an expected reference to ensure accuracy.
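Whatever test harness you use, one standard way to compare a predicted bounding box against its ground-truth counterpart is intersection-over-union (IoU): a prediction is typically counted as correct when IoU exceeds some threshold such as 0.5. A minimal implementation:

```python
# IoU between two axis-aligned boxes given as (x1, y1, x2, y2).
def iou(a, b):
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)   # intersection corners
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (0, 0, 10, 10)))  # → 1.0 for identical boxes
```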
4. Build Pipeline Logic
- **Use a Stub Application Filter**
  - If some filters are not finished (e.g., the model is still training or logic is incomplete), create stub placeholders.
  - These stubs simulate data or pass frames through so you can build the rest of the pipeline.
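A stub can be as small as a class that returns frames unchanged and emits hard-coded subject data in the eventual output shape. The `process(frame)` interface below is an assumption for illustration, not the real SDK contract.

```python
# Minimal stub filter: passes the frame through untouched and emits fake
# subject data so downstream filters can be built before the model is ready.
class StubDetectorFilter:
    def process(self, frame):
        # Simulate the real filter's eventual output shape.
        fake_subject_data = [{"label": "placeholder", "confidence": 1.0}]
        return frame, fake_subject_data

frame_out, data = StubDetectorFilter().process({"pixels": b"..."})
print(data[0]["label"])  # → placeholder
```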
- **Describe the Filter Graph**
  - In your `docker-compose.yaml` (or equivalent definition), list:
    - Sources (e.g., `video_in`, RTSP, file)
    - Utility Filters (e.g., de-warp, ROI cropping)
    - Application Filters (your new filter)
    - Connectors (data sinks, message queues)
  - Ensure each filter references the correct `FILTER_SOURCES` and `FILTER_OUTPUTS`.
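As a rough illustration, a three-stage graph might be described along these lines. Service names, image names, and the exact value format of `FILTER_SOURCES`/`FILTER_OUTPUTS` are placeholders; consult your runtime's documentation for the real syntax.

```yaml
# Illustrative docker-compose sketch of a filter graph (placeholder values).
services:
  video_in:
    image: example/video-in:latest          # source filter (file or RTSP)
    environment:
      FILTER_OUTPUTS: "tcp://*:5550"
  vehicle_detector:
    image: example/vehicle-detector:latest  # your new Application Filter
    environment:
      FILTER_SOURCES: "tcp://video_in:5550" # consume frames from the source
      FILTER_OUTPUTS: "tcp://*:5551"
  sink:
    image: example/data-sink:latest         # connector (sink, message queue)
    environment:
      FILTER_SOURCES: "tcp://vehicle_detector:5551"
```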
5. Test the Pipeline (Al Haytham)
- **Integration Testing**
  - Spin up the entire pipeline using Docker Compose or your orchestration environment (K8s, VM, etc.).
  - Send test data (video or images) through the pipeline.
- **Validation**
  - Use Al Haytham or a custom comparator to validate each filter's output.
  - Check logs, ensure frames pass through correctly, and confirm subject data integrity.
6. Publish and Iterate
- **Publish**
  - Once tested, build production-ready Docker images containing your filters.
  - Push them to your artifact repository or Docker registry.
- **Deploy**
  - Deploy the pipeline to the Edge (e.g., Docker Compose on a local machine) or the Cloud (e.g., Jester, Kubeflow).
  - Ensure your environment variables (GPU settings, concurrency) match your performance needs.
- **Monitor & Gather Feedback**
  - Collect logs, metrics, and subject data from the pipeline's real-world runs.
  - Improve model accuracy, update filters, or scale infrastructure as needed.
- **Iterate**
  - As new requirements surface or performance dips, refine your filters, retrain models, or rewire the pipeline.
  - Keep a continuous improvement loop so your vision solution evolves with changing real-world conditions.
Conclusion
- Create the Specification – Understand business needs, define subject data, confirm feasibility.
- Model Training – Gather data, label it in Encord, and train with Protege.
- Develop Filter(s) – Integrate your new model in a Docker-based Application Filter.
- Build Pipeline Logic – Chain your filters in Docker Compose (or other orchestrators).
- Test Pipeline – Use Al Haytham for integration tests and final verification.
- Publish & Iterate – Deploy, gather feedback, and refine your solution.
This iterative cycle helps deliver reliable, maintainable, and high-value computer vision solutions using the Plainsight Filters ecosystem.