GenAI – How To Optimize the User Query Component?


Table Of Contents:

  1. What Is the Query Input Component?
  2. Network Optimization Techniques
  3. Use HTTP/2 or gRPC
  4. Compress Payloads
  5. Avoid Cold Start Problem

(1) What Is the Query Input Component?

(2) Network Optimization Techniques

(3) Use HTTP/2 or gRPC

(4) Compress Payloads

Use Compression (gzip or Brotli)

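import gzip
import json

import requests

query = {
    "user_query": "..."  # a very large string
}

# Serialize to JSON, then gzip the UTF-8 bytes.
# Note: str(query) would produce a Python repr (single quotes), which is not valid JSON.
compressed_data = gzip.compress(json.dumps(query).encode("utf-8"))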

headers = {
    "Content-Encoding": "gzip",
    "Content-Type": "application/json"
}

# The module is named `requests`, not `request`.
response = requests.post("http://localhost:8000/rag/query", data=compressed_data, headers=headers)
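
Gzip adds a fixed header and trailer of at least 18 bytes, so very small payloads can end up larger after compression. The following is a minimal sketch of compressing only when it is likely to pay off. The helper name `build_request_body` and the 1 KiB threshold are assumptions for illustration, not part of the original example:

```python
import gzip
import json

# Hypothetical threshold (an assumption); tune it for your own traffic.
MIN_COMPRESS_BYTES = 1024


def build_request_body(payload: dict) -> tuple[bytes, dict]:
    """Return (body, headers), gzip-compressing only large payloads."""
    raw = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if len(raw) >= MIN_COMPRESS_BYTES:
        # Large enough that compression should outweigh the gzip overhead.
        raw = gzip.compress(raw)
        headers["Content-Encoding"] = "gzip"
    return raw, headers
```

The returned pair can be passed straight to `requests.post(url, data=body, headers=headers)`. Because `Content-Encoding` is set only when the body is actually compressed, the server can tell the two cases apart.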

Use Decompression (gzip or Brotli)

from fastapi import FastAPI, Request
import gzip
import json

app = FastAPI()

@app.post("/rag/query")  # route paths must start with "/"
async def rag_query(request: Request):
    # Check if the client sent gzip-compressed data
    if request.headers.get("Content-Encoding") == "gzip":
        # Read the raw bytes from the request body
        compressed_body = await request.body()
        # Decompress the gzip bytes
        decompressed_bytes = gzip.decompress(compressed_body)
        # Decode bytes to string and parse JSON
        data = json.loads(decompressed_bytes.decode('utf-8'))
    else:
        # If not compressed, parse JSON normally
        data = await request.json()
    
    user_query = data.get("user_query")
    return {"received_query": user_query}
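
The handler above assumes the body is always valid gzip and valid JSON. As a rough sketch, the decoding branch can be pulled into a plain function that is testable without a running server and that rejects malformed input. The name `decode_body` and the choice of `ValueError` are assumptions for illustration:

```python
import gzip
import json
import zlib
from typing import Optional


def decode_body(body: bytes, content_encoding: Optional[str]) -> dict:
    """Decode a request body into a dict, decompressing gzip if declared."""
    if content_encoding == "gzip":
        try:
            body = gzip.decompress(body)
        except (OSError, EOFError, zlib.error) as exc:
            # Bad magic bytes, truncated stream, or corrupt deflate data.
            raise ValueError("invalid gzip body") from exc
    # json.JSONDecodeError is itself a ValueError subclass.
    return json.loads(body.decode("utf-8"))


# Round-trip check: what the client compresses, the server recovers.
payload = {"user_query": "what is retrieval-augmented generation? " * 200}
raw = json.dumps(payload).encode("utf-8")
compressed = gzip.compress(raw)

assert decode_body(compressed, "gzip") == payload
assert decode_body(raw, None) == payload
print(f"raw={len(raw)} bytes, gzip={len(compressed)} bytes")
```

Inside the FastAPI route, a caught `ValueError` could be turned into a 400 response so that a malformed body does not surface as a 500.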

(5) Avoid Cold Start Problem
