You may have used some kind of reverse image search before. Put simply, instead of searching with text like "australian shepherds running", you can search with an image like "australian_shepherd_running.png". The search engine will then find all similar images based on that input.
But have you used reverse video search? The approach is the same: use your video as a query to find other videos.
💻 Embedding Models Used: https://docs.mixpeek.com/overview/models/available-models
📓 Features Extracted: https://docs.mixpeek.com/ingestion/extractors
🧑🏻‍🏫 Multimodal University: https://mixpeek.com/learn
Let's first explore reverse image search
Try it on Google Images: https://images.google.com/
In the example below, I'll upload a picture of an Australian Shepherd dog, and Google's reverse image search will find all similar pictures of Australian Shepherds.
Use cases for reverse image search
There are tons of awesome use cases for reverse image search, like:
- E-commerce: Helps customers find products by uploading images, increasing sales by simplifying the shopping experience.
- Intellectual Property: Identifies unauthorized use of images, aiding in copyright enforcement and protecting creators' rights.
- Content Verification: Verifies the authenticity of images in news and social media, combating misinformation.
- Real Estate: Allows users to find properties by uploading photos, enhancing user experience and engagement.
Image Feature Extraction
To perform a search, we first need to extract features from the image. Below, we're just leaving the default options, but you can go crazy with how many features you pull out:
import requests

url = "https://api.mixpeek.com/ingest/images/url"

payload = {
    "url": "https://www.akc.org/wp-content/uploads/2017/11/Australian-Shepherd.1.jpg",
    "collection": "sample_dogs",
    "feature_extractors": {
        "embed": [
            {
                "type": "url",
                "embedding_model": "image"
            }
        ]
    }
}

headers = {
    "Authorization": "Bearer API_KEY",  # omitted from later snippets for brevity
    "Content-Type": "application/json"
}

response = requests.post(url, json=payload, headers=headers)
print(response.text)
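The exact shape of the ingest response isn't documented in this post, so here's a minimal sketch of handling it, assuming only that the body is JSON:

# Minimal sketch of inspecting the ingest response; the exact schema
# is an assumption -- consult the Mixpeek docs for the real fields.
if response.ok:
    data = response.json()
    print(data)  # typically carries an identifier for tracking processing
else:
    response.raise_for_status()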
Reverse Image Search
import requests

url = "https://api.mixpeek.com/features/search"

payload = {
    "queries": [
        {
            "type": "url",
            "value": "https://www.akc.org/wp-content/uploads/2017/11/Australian-Shepherd.1.jpg",
            "vector_index": "image"
        }
    ],
    "collections": ["sample_dogs"]
}

response = requests.post(url, json=payload, headers=headers)  # headers as defined above
print(response.text)
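Once the response comes back, you can render the hits. This is a hypothetical sketch: field names like "results", "score", and "url" are assumptions, not the confirmed response schema:

# Hypothetical rendering of search hits; "results", "score", and "url"
# are assumed field names, not the confirmed schema.
for hit in response.json().get("results", []):
    print(hit.get("score"), hit.get("url"))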
But what about video?
Reverse video search works the same way: we first embed a couple of videos, then provide a sample video as the search query.
For our index, we'll use a movie trailer from the 1940s classic, The Third Man:
Prepare the video(s)
We'll split the video into 5-second intervals, then embed each interval using the multimodal embedding model. We'll also pull out a description of each interval.
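To make the interval logic concrete, here's a sketch of how 5-second windows partition a clip. The API does this chunking server-side, and the 40-second duration is just an assumed value for illustration:

# Conceptual sketch of 5-second windows over a video (illustration only).
interval_sec = 5
duration_sec = 40  # assumed trailer length, for illustration
intervals = [
    (start, min(start + interval_sec, duration_sec))
    for start in range(0, duration_sec, interval_sec)
]
print(intervals)  # [(0, 5), (5, 10), ..., (35, 40)]

With that picture in mind, here's the actual ingest call: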
import requests

url = "https://api.mixpeek.com/ingest/videos/url"

payload = {
    "url": "https://mixpeek-public-demo.s3.us-east-2.amazonaws.com/media-analysis/The+Third+Man++Official+Trailer.mp4",
    "collection": "my_video_collection",
    "feature_extractors": [
        {
            "interval_sec": 5,
            "describe": {
                "enabled": True
            },
            "embed": [
                {
                    "type": "url",
                    "embedding_model": "multimodal"
                }
            ]
        }
    ]
}

response = requests.post(url, json=payload, headers=headers)  # headers as defined above
print(response.text)
Embed the video to search and run!
Now we have a grainy video clip from some CCTV that we'll use for our reverse video search:
We'll do the same thing; the only difference is that we now take the embedding of the video we want to search with and run it across the already indexed and embedded videos:
import requests

url = "https://api.mixpeek.com/features/search"

payload = {
    "queries": [
        {
            "type": "url",
            "value": "https://mixpeek-public-demo.s3.us-east-2.amazonaws.com/media-analysis/video_queries/exiting_sewer.mp4",
            "vector_index": "multimodal"
        }
    ],
    "collections": ["my_video_collection"]
}

response = requests.post(url, json=payload, headers=headers)  # headers as defined above
print(response.text)
This will return an object that contains the key "embedding".
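Assuming the response really does carry that key, pulling the query embedding out is a one-liner (a sketch, not the confirmed schema):

# Assumes an "embedding" key, per the description above.
query_embedding = response.json().get("embedding")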
Compare results
Now that we have our embeddings, we can run a KNN (k-nearest neighbors) search. The search returns an array of objects, indicating the most similar video timestamps for our embedding query, that we can render in our application:
results = [
{"start_time": 25.0, "end_time": 30.0, "score": 0.6265061},
{"start_time": 5.0, "end_time": 10.0, "score": 0.6025797},
{"start_time": 30.0, "end_time": 35.0, "score": 0.59880114},
]
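Under the hood, a KNN search amounts to ranking the stored interval embeddings by cosine similarity to the query embedding. Here's a minimal NumPy sketch of that ranking, assuming you had the raw vectors locally (the API does this for you server-side):

import numpy as np

def knn(query_embedding, interval_embeddings, k=3):
    # Rank stored interval embeddings by cosine similarity to the query.
    q = np.asarray(query_embedding, dtype=float)
    m = np.asarray(interval_embeddings, dtype=float)
    q = q / np.linalg.norm(q)                         # L2-normalize the query
    m = m / np.linalg.norm(m, axis=1, keepdims=True)  # L2-normalize each row
    scores = m @ q                                    # cosine similarities
    top = np.argsort(scores)[::-1][:k]                # highest scores first
    return [(int(i), float(scores[i])) for i in top]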
Now if we look at the original video @ 25 seconds in:
Amazing, we found a challenging scene to describe using a video query as an input. Now imagine doing that across billions of videos 🤯
Using this template, we set it up so that whenever a new object is added to our S3 bucket, it's automatically processed and inserted into our database (connection established beforehand). Additionally, if a video is ever deleted from our S3 bucket, its embeddings are deleted from our database as well.
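One way to wire that up is an AWS Lambda function subscribed to the bucket's event notifications. The sketch below uses the standard S3 event shape, but ingest_video_url and delete_embeddings_for are hypothetical helpers standing in for calls to the ingest and delete endpoints:

# Hypothetical Lambda handler for S3 event notifications.
# ingest_video_url / delete_embeddings_for are placeholder helpers,
# not Mixpeek SDK functions.
def lambda_handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        url = f"https://{bucket}.s3.amazonaws.com/{key}"
        if record["eventName"].startswith("ObjectCreated"):
            ingest_video_url(url)        # e.g., POST /ingest/videos/url
        elif record["eventName"].startswith("ObjectRemoved"):
            delete_embeddings_for(url)   # drop the stale embeddings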
Use cases for video search
- Content Creation: Enables creators to find specific video clips quickly, streamlining the editing process.
- Media Monitoring: Identifies reused video content across platforms, aiding in tracking content spread and copyright enforcement.
- E-commerce: Helps customers find products by uploading video snippets, enhancing the shopping experience.
- Security and Surveillance: Analyzes footage to detect specific events or objects, improving security measures.