Transforming Multimodal Search with Mixpeek 0.9.0

At Mixpeek, we're on a mission to make search across images, video, audio, and text accessible and powerful. Release 0.9.0 introduces six capabilities that address real-world challenges in building multimodal applications: namespaces, hybrid search, search interactions, feature extractors, taxonomies, and pre-signed URLs. Let's dive into the motivation behind each one and what it provides.

Namespaces: Beyond Simple Data Isolation

The Challenge

Organizations struggle to manage multiple environments (dev/prod), different use cases, and evolving machine learning models while keeping APIs and performance consistent.

Our Solution

Namespaces in Mixpeek go beyond simple data isolation. They provide:

  1. Environment Management
    • Create isolated spaces for development, staging, and production
    • Test new features without affecting production data
    • Maintain separate access controls and quotas
  2. Model Flexibility
    • We abstract away model names and versions
    • When we upgrade our models, we automatically re-embed your content
    • Switch between different embedding models for different use cases
    • Zero changes needed in your application code
  3. Use Case Optimization
    • Configure different vector indexes for different content types
    • Optimize search for specific domains (e.g., faces vs. scenes)
    • Mix and match feature extractors per namespace
    • Combine multilingual text embedding models with scene-specific models

Example use case:

# Production environment with high-quality models
prod_namespace = {
    "namespace_id": "netflix_prod",
    "vector_indexes": ["image_vector", "text_vector"],
    "payload_indexes": [...]
}

# Development environment for testing
dev_namespace = {
    "namespace_id": "netflix_dev",
    "vector_indexes": ["image_vector"],  # Limited indexes for cost savings
    "payload_indexes": [...]
}
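
One way to keep application code unchanged across environments is to resolve the namespace from configuration rather than hard-coding it. The following is a minimal sketch that reuses the namespace IDs from the example above; `namespace_for`, `supports_index`, and the `APP_ENV` variable are hypothetical names chosen for illustration, not part of the Mixpeek API.

```python
import os

# Namespace definitions mirroring the example above (payload indexes omitted).
NAMESPACES = {
    "prod": {"namespace_id": "netflix_prod", "vector_indexes": ["image_vector", "text_vector"]},
    "dev": {"namespace_id": "netflix_dev", "vector_indexes": ["image_vector"]},
}


def namespace_for(env=None):
    """Return the namespace config for an environment (hypothetical helper)."""
    env = env or os.environ.get("APP_ENV", "dev")
    if env not in NAMESPACES:
        raise ValueError(f"unknown environment: {env!r}")
    return NAMESPACES[env]


def supports_index(namespace, vector_index):
    """Check whether a query against `vector_index` can run in this namespace."""
    return vector_index in namespace["vector_indexes"]
```

With this layout, a text query can be guarded by `supports_index(namespace_for(), "text_vector")`, so the reduced development namespace fails fast instead of issuing a query against an index it does not have.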

Hybrid Search: Making Vector Search Practical

The Challenge

Pure vector search is powerful but often impractical. Real applications need to combine semantic understanding with traditional filtering and ranking.

Our Solution

Our hybrid search system provides:

  1. Unified Query Interface
    • Combine vector similarity with metadata filters
    • Support for complex boolean logic
    • Multiple vector queries with automatic result fusion
  2. Smart Ranking
    • Automatic score normalization across different vector spaces
    • Configurable weighting between different signals
    • Integration with interaction data for dynamic ranking
  3. Performance Optimization
    • Efficient filter-first architecture
    • Automatic query planning
    • Caching and result reuse
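
To make score normalization and configurable weighting concrete, here is a minimal sketch of one common approach: min-max normalization per vector index followed by a weighted sum. This illustrates the general technique only; the post does not say which fusion method Mixpeek uses, and the function names, scores, and weights below are assumptions.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the [0, 1] range."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def fuse(results_by_index, weights):
    """Weighted sum of normalized scores; a document missing from an index adds 0."""
    fused = {}
    for index_name, scores in results_by_index.items():
        weight = weights.get(index_name, 1.0)
        for doc_id, score in min_max_normalize(scores).items():
            fused[doc_id] = fused.get(doc_id, 0.0) + weight * score
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Made-up scores on two different scales (e.g. cosine similarity vs. dot product).
ranked = fuse(
    {
        "text_vector": {"vid_1": 0.82, "vid_2": 0.40, "vid_3": 0.61},
        "image_vector": {"vid_2": 12.0, "vid_3": 30.0},
    },
    weights={"text_vector": 0.6, "image_vector": 0.4},
)
```

Normalizing before weighting matters because raw scores from different vector spaces are not comparable; without it, the index with the larger numeric range would dominate the ranking regardless of the weights.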

Example complex query:

{
    "queries": [
        {
            "vector_index": "text_vector",
            "value": "dramatic car chase scene",
            "type": "text"
        },
        {
            "vector_index": "image_vector",
            "value": "base64_encoded_reference_image",
            "type": "base64"
        }
    ],
    "filters": {
        "AND": [
            {"key": "metadata.year", "value": 2023},
            {"key": "metadata.genre", "value": "action"}
        ]
    }
}
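
The filter-first idea can be sketched by evaluating the `filters` tree against each item's payload before any vector comparison takes place. The sketch below assumes that a leaf condition without an explicit operator means equality and that an `OR` node mirrors `AND`; both are assumptions, as are the helper names and the sample catalog.

```python
def get_path(payload, dotted_key):
    """Resolve a dotted key such as 'metadata.year' in a nested dict."""
    current = payload
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def matches(payload, node):
    """Evaluate a filter node; leaf conditions are treated as equality checks."""
    if "AND" in node:
        return all(matches(payload, child) for child in node["AND"])
    if "OR" in node:  # assumed counterpart to AND, not shown in the example above
        return any(matches(payload, child) for child in node["OR"])
    return get_path(payload, node["key"]) == node["value"]


filters = {
    "AND": [
        {"key": "metadata.year", "value": 2023},
        {"key": "metadata.genre", "value": "action"},
    ]
}
catalog = [
    {"id": "vid_1", "metadata": {"year": 2023, "genre": "action"}},
    {"id": "vid_2", "metadata": {"year": 2021, "genre": "action"}},
]
# Only the surviving candidates would go on to vector scoring.
candidates = [doc["id"] for doc in catalog if matches(doc, filters)]
```

Filtering first shrinks the candidate set, so the more expensive similarity computation runs only on items that can appear in the final result.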

Search Interactions: Learning from User Behavior

The Challenge

Search results need to improve over time and adapt to user preferences, but collecting and utilizing interaction data is complex.

Our Solution

Our interaction system enables:

  1. Automated Learning
    • Collect click, view, and feedback data
    • Automatic result re-ranking based on user behavior
    • Support for explicit and implicit feedback
  2. Search Analytics
    • Track search effectiveness
    • Identify content gaps
    • Monitor user engagement
  3. Personalization Pipeline
    • Session-based personalization
    • Long-term learning from interactions
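    • Custom ranking models

Example interaction record:

# The search request that produced the result (abbreviated for illustration)
original_request = {
    "queries": [
        {"vector_index": "text_vector", "value": "dramatic car chase scene", "type": "text"}
    ]
}

# Record user interaction
interaction = {
    "feature_id": "vid_123",
    "interaction_type": "click",
    "search_request": original_request,
    "metadata": {
        "watch_duration": 142,
        "user_segment": "premium"
    }
}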
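
To illustrate how interaction data can feed back into ranking, the sketch below adds a small boost to each result's base score according to its logged interactions. It is a simplified illustration, not a description of Mixpeek's re-ranking: the weights, the `alpha` factor, and the `positive_feedback` type are assumptions made for the example.

```python
from collections import Counter

# Assumed weights per interaction type; tune these for a real application.
INTERACTION_WEIGHTS = {"view": 0.5, "click": 1.0, "positive_feedback": 2.0}


def interaction_boosts(interactions):
    """Sum the weighted interactions recorded for each feature_id."""
    boosts = Counter()
    for item in interactions:
        boosts[item["feature_id"]] += INTERACTION_WEIGHTS.get(item["interaction_type"], 0.0)
    return boosts


def rerank(results, interactions, alpha=0.1):
    """Add alpha * boost to each base score and sort in descending order."""
    boosts = interaction_boosts(interactions)
    rescored = [(fid, score + alpha * boosts[fid]) for fid, score in results]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


results = [("vid_456", 0.80), ("vid_123", 0.75)]
log = [
    {"feature_id": "vid_123", "interaction_type": "click"},
    {"feature_id": "vid_123", "interaction_type": "view"},
]
reranked = rerank(results, log)
```

Here `vid_123` starts behind `vid_456` but moves ahead once its click and view are counted. Keeping `alpha` small lets behavior nudge the ranking without overriding relevance.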

Feature Extractors: Customizable Understanding

The Challenge

Different applications need different types of understanding from their video content, and processing needs to be efficient and cost-effective.

Our Solution

Our feature extractors provide:

  1. Modular Processing
    • Choose only the features you need
    • Configure processing intervals
    • Balance quality vs. cost
  2. Rich Understanding
    • Scene-level descriptions
    • Object and face detection
    • Text extraction (OCR)
    • Audio transcription
    • Custom JSON extraction
  3. Efficient Processing
    • Smart caching of intermediate results
    • Parallel processing pipelines
    • Automatic optimization of extraction settings

Example specialized configuration:

{
    "interval_sec": 10,
    "describe": {
        "enabled": true,
        "prompt": "Focus on identifying product placements and brand mentions"
    },
    "detect": {
        "logos": {
            "enabled": true,
            "confidence_threshold": 0.7
        }
    }
}
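
The quality-versus-cost trade-off behind `interval_sec` can be estimated with a few lines of arithmetic. The sketch below assumes that a video is split into fixed-length windows and that each enabled extractor runs once per window; the post does not specify how Mixpeek segments video, so treat this as a rough estimate. The helper names are hypothetical.

```python
import math


def sample_timestamps(duration_sec, interval_sec):
    """Return the start time, in seconds, of each processing window."""
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")
    count = math.ceil(duration_sec / interval_sec)
    return [i * interval_sec for i in range(count)]


def enabled_extractors(config):
    """List which extractors are switched on in a config like the one above."""
    names = []
    if config.get("describe", {}).get("enabled"):
        names.append("describe")
    for name, settings in config.get("detect", {}).items():
        if settings.get("enabled"):
            names.append(f"detect.{name}")
    return names


config = {
    "interval_sec": 10,
    "describe": {"enabled": True},
    "detect": {"logos": {"enabled": True, "confidence_threshold": 0.7}},
}
windows = sample_timestamps(95, config["interval_sec"])  # a 95-second clip
calls = len(windows) * len(enabled_extractors(config))
```

Under these assumptions, halving `interval_sec` roughly doubles the number of extractor calls, which is the lever to adjust when balancing detail against processing cost.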

Taxonomies: Structured Understanding

The Challenge

Organizations need consistent ways to classify and organize content, but manual classification is time-consuming and error-prone.

Our Solution

Our taxonomy system enables:

  1. Flexible Classification
    • Create custom classification schemes
    • Hierarchical taxonomies
    • Multiple taxonomies per namespace
  2. Automated Classification
    • ML-powered content categorization
    • Confidence scores for classifications
    • Bulk processing capabilities
  3. Integration with Search
    • Filter by taxonomy terms
    • Faceted search
    • Taxonomy-aware ranking
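
A hierarchical taxonomy combined with confidence scores can be sketched as a parent map plus a filter that accepts a term or any of its descendants. The taxonomy, labels, confidence values, and threshold below are invented for illustration and do not reflect Mixpeek's data model.

```python
# Hypothetical taxonomy: each node maps to its parent (None marks a root).
PARENTS = {
    "sports": None,
    "motorsport": "sports",
    "formula_1": "motorsport",
    "cooking": None,
}


def ancestors(node):
    """Return the node followed by its ancestors up to the root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENTS[node]
    return chain


def filter_by_term(items, term, min_confidence=0.5):
    """Keep items classified under `term` or any of its descendants."""
    kept = []
    for item in items:
        label, confidence = item["label"], item["confidence"]
        if confidence >= min_confidence and term in ancestors(label):
            kept.append(item["id"])
    return kept


items = [
    {"id": "vid_1", "label": "formula_1", "confidence": 0.91},
    {"id": "vid_2", "label": "cooking", "confidence": 0.88},
    {"id": "vid_3", "label": "motorsport", "confidence": 0.35},
]
sports_items = filter_by_term(items, "sports")
```

Filtering on `sports` returns `vid_1` because `formula_1` sits beneath it in the hierarchy, while `vid_3` is dropped for falling below the confidence threshold.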

Pre-signed URLs: Secure Content Delivery

The Challenge

Serving video content securely while maintaining performance and controlling access is complex.

Our Solution

Our pre-signed URL system provides:

  1. Security
    • Time-limited access tokens
    • Path-restricted URLs
    • IP-based restrictions (optional)
  2. Performance
    • CDN integration
    • Automatic URL generation
    • Preview image generation
  3. Access Control
    • Per-user access tracking
    • Usage quotas
    • Bandwidth controls

Looking Forward

These features lay the groundwork for our vision of making video understanding accessible to every developer. Coming soon:

  • Advanced personalization capabilities
  • More pre-trained models
  • Real-time processing capabilities
  • Enhanced analytics dashboards

Want to learn more? Contact our team for a detailed discussion of how these features can help your specific use case.

About the author
Ethan Steininger

Probably outside.