This guide will get you up and running with semantic search for building RAG-enabled LLM applications. A full script is included at the end of this doc.

1. Create an Account

Go to https://www.pongo.ai/ to create an account. Be sure to save the API key generated during onboarding, or get one from the API keys page.
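One way to keep the saved key out of your source code is to read it from an environment variable. The sketch below is only a suggestion, not something Pongo requires; the variable name PONGO_SECRET_KEY and the helper load_api_key are assumptions made for illustration.

```python
import os


def load_api_key(env=os.environ, var_name="PONGO_SECRET_KEY"):
    """Return the API key stored in an environment variable.

    Both the variable name and this helper are hypothetical; use whatever
    secret-management approach your project already follows.
    """
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable to your Pongo API key")
    return key
```

The returned string can then be passed wherever this guide uses PONGO_SECRET_KEY.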

2. Install the Pongo Client

pip install --upgrade pongo-python

3. Create a Sub-organization

Pongo enables you to isolate data into pods called “sub-organizations.” Each sub-organization’s data is kept separate from others to prevent data contamination.

import pongo

# Replace the placeholder with your actual API key
PONGO_SECRET_KEY = 'your-api-key'
pongo_client = pongo.PongoClient(PONGO_SECRET_KEY)

response = pongo_client.create_sub_org(sub_org_name='Tutorial Organization')
created_sub_organization_data = response.json()
sub_org_id = created_sub_organization_data['sub_org_id']

4. Upload Data

You can upload data as individual strings or in batches. Below is an example of a batch upload in which all elements share the same metadata. See the Upload API Reference for more upload options.

import pongo

# Replace the placeholder with your actual API key
PONGO_SECRET_KEY = 'your-api-key'
pongo_client = pongo.PongoClient(PONGO_SECRET_KEY)

response = pongo_client.upload(
  sub_org_id='gotten-sub-org-id',  # the sub_org_id returned in step 3
  data=["User1's favorite color is red.", "User2's favorite color is blue.", "User3's favorite color is green."],
  metadata={'parent_id': 'unique-id-for-the-document', 'source': 'Your Source'}
)
upload_result_data = response.json()
job_id = upload_result_data['job_id']
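If you have more strings than you want to send in a single call, you could split them into smaller batches first. This is a minimal sketch: chunked is a hypothetical helper, and the batch size is arbitrary rather than a documented Pongo limit.

```python
def chunked(items, batch_size=100):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


docs = [f"User{i}'s favorite color is red." for i in range(1, 6)]
batches = list(chunked(docs, batch_size=2))
# Each batch could then be passed as the `data` argument of an upload call.
```

Each upload would return its own job ID, so you would need to track one ID per batch.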

5. Check Job Status (Optional)

You can use the job ID returned in the previous step to periodically check the upload's status until it has finished processing. Learn more on the Jobs page.

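import time
import pongo

# Replace the placeholder with your actual API key
PONGO_SECRET_KEY = 'your-api-key'
pongo_client = pongo.PongoClient(PONGO_SECRET_KEY)

# Replace these with the IDs returned in steps 3 and 4
sub_org_id = 'gotten-sub-org-id'
job_id = 'gotten-job-id'

# Poll every 5 seconds until the job has finished processing
while True:
    job_status = pongo_client.get_job(job_id=job_id, sub_org_id=sub_org_id).json()['job']['job_status']

    if job_status == 'processed':
        break
    else:
        print(f'waiting for job {job_id} to process')
        time.sleep(5)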
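The loop above waits indefinitely if a job never reaches 'processed'. A possible variation, sketched here with only the standard library, adds a timeout. The helper wait_until_processed and its default values are assumptions for illustration; only the 'processed' status string comes from this guide.

```python
import time


def wait_until_processed(fetch_status, timeout_s=300.0, interval_s=5.0, sleep=time.sleep):
    """Call `fetch_status` until it returns 'processed' or the timeout expires.

    `fetch_status` is any zero-argument callable returning the job status
    string, for example a small wrapper around the get_job call shown above.
    Returns the number of status checks that were made.
    """
    deadline = time.monotonic() + timeout_s
    checks = 0
    while True:
        checks += 1
        if fetch_status() == "processed":
            return checks
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job not processed after {timeout_s} seconds")
        sleep(interval_s)
```

Passing the status lookup in as a callable keeps the helper independent of the client, which also makes it easy to exercise with a fake status source.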

6. Search via API

After your data has been ingested, you can search it using the search playground or the code below.

import pongo

# Replace the placeholder with your actual API key
PONGO_SECRET_KEY = 'your-api-key'
pongo_client = pongo.PongoClient(PONGO_SECRET_KEY)

search_response = pongo_client.search(sub_org_id="your-sub-org", query="What is user1's favorite color?")

print(search_response.json())
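Since this guide targets RAG applications, a common next step is to place the retrieved text into an LLM prompt. The guide does not show the shape of the search response, so the sketch below takes a plain list of passage strings that you would extract from the results yourself. The function build_rag_prompt and the prompt wording are hypothetical.

```python
def build_rag_prompt(question, passages, max_passages=3):
    """Format retrieved passages and a question into a single prompt string."""
    selected = passages[:max_passages]
    context = "\n".join(f"[{i}] {text}" for i, text in enumerate(selected, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


prompt = build_rag_prompt(
    "What is user1's favorite color?",
    ["User1's favorite color is red.", "User2's favorite color is blue."],
)
```

Numbering the passages lets the model cite which passage supports its answer, and capping their count keeps the prompt short.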

7. Bringing It All Together

Here are all the pieces we built, put together into one script. Happy building!

import time
import uuid

import pongo
# Install with "pip install --upgrade pongo-python"

# Replace the placeholder with your actual API key
PONGO_SECRET_KEY = 'your-api-key'
pongo_client = pongo.PongoClient(PONGO_SECRET_KEY)

def pongo_demo():
    sub_org_id = pongo_client.create_sub_org(sub_org_name='Tutorial Organization').json()['sub_org_id']

    group_parent_id = str(uuid.uuid4())
    # Upload your own text data to Pongo
    upload_response = pongo_client.upload(
        data=[
            "User1's favorite color is blue.",
            "User2's favorite color is red.",
            "User3's favorite color is green."],
        metadata={'parent_id': group_parent_id, 'source': 'Pongo Tutorial'},
        sub_org_id=sub_org_id)
    upload_result_data = upload_response.json()

    job_id = upload_result_data['job_id']
    
    #Wait for the job to complete before searching for the data we uploaded
    while True:
        job_status = pongo_client.get_job(job_id=job_id, sub_org_id=sub_org_id).json()['job']['job_status']

        if job_status == 'processed':
            break
        else:
            print(f'waiting for job {job_id} to process')
            time.sleep(5)
    
    #Returns the data we uploaded earlier
    search_response = pongo_client.search(query="What is User1's favorite color?", sub_org_id=sub_org_id)

    return search_response.json() 

print(pongo_demo())