    Create Your Own AI Meme Search Engine using Jina

    Introduction

    In this workshop, we'll be building a search engine for memes using Jina, an open-source neural search framework. Jyoti and Alex from Jina AI will guide you through how to create, deploy, and understand a meme search engine using both text and images.

    Overview

    The session was hosted by Jyoti and Alex from Jina AI, who discussed what neural search is, how it builds on AI models, and how to implement it with Jina's open-source tools. This article walks through the workshop tutorial step by step, including basic setup and deployment.

    Detailed Steps

    Initial Setup

    First, we prepare the environment by importing the necessary libraries and setting up our basic configurations:

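    import warnings
    warnings.filterwarnings('ignore')  # silence library warnings in the notebook
    import os
    from google.colab import files  # Colab-only; drop this line when running elsewhere
    import json
    from jina import Document, DocumentArray, Flow
    # Import path as given in the workshop; it may differ between Jina versions
    from jina.types.document.generators import from_csv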
    

    Downloading the Dataset

    We use the Imgflip meme dataset, originally published on Kaggle. The command below fetches a JSON copy from the jina-meme-search GitHub repository:

    !wget 'https://raw.githubusercontent.com/alexcg1/jina-meme-search/master/data/memes.json' -P data/
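
The `!wget` line only works inside a notebook. Outside one, the same download can be sketched with the Python standard library. The helper names `local_path` and `download` are illustrative assumptions, not part of the workshop code; only the URL and the `data/` directory come from the command above.

```python
import os
import urllib.request

MEMES_URL = (
    "https://raw.githubusercontent.com/alexcg1/"
    "jina-meme-search/master/data/memes.json"
)

def local_path(url, dest_dir="data"):
    """Return where the file would be saved, mirroring `wget -P dest_dir`."""
    return os.path.join(dest_dir, os.path.basename(url))

def download(url, dest_dir="data"):
    """Download url into dest_dir unless the file is already present."""
    os.makedirs(dest_dir, exist_ok=True)
    path = local_path(url, dest_dir)
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)
    return path

# Hypothetical usage (performs a network request):
# download(MEMES_URL)
```

Skipping the request when the file already exists keeps repeated notebook runs from downloading the dataset again.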
    

    Loading the Data

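    A custom function loads the JSON data into a DocumentArray, optionally shuffling the records before truncating:

    import random

    def load_data(filepath, max_docs, shuffle=False):
        """Load up to max_docs memes from a JSON file into a DocumentArray."""
        memes = DocumentArray()
        with open(filepath, 'r') as file:
            raw_meme_data = json.load(file)
        if shuffle:
            random.shuffle(raw_meme_data)  # shuffle in place before truncating
        for meme in raw_meme_data[:max_docs]:
            # Searchable text is "<template>: <caption_text>"; the full record goes in tags
            doc = Document(
                text=f"{meme['template']}: {meme['caption_text']}",
                tags=meme,
            )
            memes.append(doc)
        return memes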
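
To see what text each Document ends up with, the string-building step can be tried on its own without Jina. This is a minimal sketch: the sample records are made up for illustration and only the `template` and `caption_text` field names come from the loader above. Skipping incomplete records is an added assumption, since the loader itself would raise a `KeyError` on them.

```python
import json

def meme_text(meme):
    """Build the searchable string for one record: '<template>: <caption_text>'."""
    return f"{meme['template']}: {meme['caption_text']}"

# Hypothetical records shaped like the fields the loader reads.
sample_json = json.dumps([
    {"template": "Distracted Boyfriend", "caption_text": "me; sleep; one more episode"},
    {"template": "Drake Hotline Bling", "caption_text": "homework; memes"},
    {"template": "Broken Record"},  # missing caption_text, so it is skipped
])

records = json.loads(sample_json)
texts = [
    meme_text(m)
    for m in records
    if "template" in m and "caption_text" in m
]
for t in texts:
    print(t)
```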
    

    Load and shuffle the data:

    docs = load_data('data/memes.json', 50, True)
    

    Setting Up the Flow

    Next, we need to create a Jina flow that will process the data through an encoder and an indexer:

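    f = (
        Flow()
        # Encoder: turns each Document's text into an embedding
        .add(uses='jinahub://TransformerTorchEncoder')
        # Indexer: stores the embedded Documents and matches queries against them
        .add(uses='jinahub://SimpleIndexer',
             uses_with={'index_file_name': 'index'},
             install_requirements=True)
    )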
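
To make the two stages concrete, here is a toy, Jina-free sketch of what an encoder and an indexer do conceptually. It is not how TransformerTorchEncoder or SimpleIndexer are implemented: the word-count "embedding" stands in for a transformer model, and `encode`, `cosine`, and `ToyIndexer` are hypothetical names used only for illustration.

```python
import math
from collections import Counter

def encode(text):
    """Toy encoder: a word-count vector. A real encoder returns dense embeddings."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class ToyIndexer:
    """Stores (text, vector) pairs and ranks them against a query."""

    def __init__(self):
        self._items = []

    def index(self, texts):
        for text in texts:
            self._items.append((text, encode(text)))

    def search(self, query, top_k=2):
        q = encode(query)
        scored = [(cosine(q, vec), text) for text, vec in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:top_k]]

indexer = ToyIndexer()
indexer.index([
    "first day of school",
    "cat wants food now",
    "school is out for summer",
])
results = indexer.search("school", top_k=2)
print(results)
```

The word-count encoder only matches exact words. A transformer encoder is what lets a query match captions that are related in meaning without sharing vocabulary.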
    

    Indexing the Data

    We process the data through the flow to index it:

    with f:
        # Endpoints are written with a leading slash
        f.post(on='/index', inputs=docs, show_progress=True)
    

    Searching Through the Index

    We create a simple search function to send queries to the flow:

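    def query_search(text):
        query_doc = Document(text=text)
        with f:
            # return_results=True makes the call return the responses
            result = f.search(inputs=query_doc, return_results=True)
        # One query Document was sent, so take its matches from the first response
        return result[0].docs[0].matches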
    

    Example query:

    search_results = query_search('school')
    

    Displaying Results

    Finally, we use matplotlib to plot the search results:

    import matplotlib.pyplot as plt
    import requests
    from PIL import Image
    from io import BytesIO
    
    def show_results(results):
        # squeeze=False keeps axs two-dimensional even when there is a single result
        fig, axs = plt.subplots(1, len(results), figsize=(20, 10), squeeze=False)
        for ax, res in zip(axs[0], results):
            response = requests.get(res.tags['image_url'])
            img = Image.open(BytesIO(response.content))
            ax.imshow(img)
            ax.axis('off')
            ax.set_title(res.text)
        plt.show()
    
    show_results(search_results)
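
A single row becomes cramped once a query returns more than a handful of matches. One option, sketched here as an assumption rather than as part of the workshop code, is to compute a grid shape first. `grid_shape` and its `max_cols` parameter are hypothetical names.

```python
import math

def grid_shape(n_results, max_cols=5):
    """Return (rows, cols) fitting n_results with at most max_cols per row."""
    if n_results < 1:
        raise ValueError("need at least one result to plot")
    cols = min(n_results, max_cols)
    rows = math.ceil(n_results / cols)
    return rows, cols

print(grid_shape(12))
```

The resulting pair can be passed to `plt.subplots(rows, cols, squeeze=False)`, and the axes can then be walked in order with `axs.flat`.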
    

    Conclusion

    In this workshop, we covered the installation and setup process for using Jina for neural search. The example demonstrated how you can create a meme search engine using text-based embeddings. From setting up the environment to processing the data and indexing it for search, you should now have a basic understanding of how neural search works using Jina.

    Keywords

    • Jina AI
    • Neural Search
    • Meme Search Engine
    • DocumentArray
    • Indexing
    • Query Search
    • Flow
    • TransformerTorchEncoder
    • SimpleIndexer

    FAQ

    Q1: Can I use Jina for data types other than text and images?

    A1: Yes, Jina supports multiple data types including text, images, audio, video, and even advanced types like 3D mesh.

    Q2: What is the advantage of using Jina over traditional search engines?

    A2: Jina uses neural networks to understand the semantic meaning of data, offering more accurate and meaningful search results compared to traditional search engines that rely on keyword matching.

    Q3: How does Jina handle dependencies for different machine learning models?

    A3: Jina uses Docker to sandbox different environments, thus preventing dependency clashes and ensuring smooth operation across various models.

    Q4: Is it possible to fine-tune the model used in the search engine?

    A4: Yes, you can fine-tune models with Jina's Finetuner, which adapts a model so that it handles your specific type of data better.

    Q5: What do I do if I run into issues while using Jina?

    A5: You can always reach out to the Jina community on Slack, or open an issue on their GitHub repository. The team is very responsive and happy to help with any problems you encounter.
