GenAI – How To Optimize The User Query Component?
Table Of Contents:
- What Is Query Input Component?
- Network Optimization Techniques.
- Use HTTP/2 or gRPC
- Compress Payloads
- Avoid Cold Start Problem
(1) What Is Query Input Component?
(2) Network Optimization Techniques.
(3) Use HTTP/2 or gRPC
(4) Compress Payloads
Use Compression (gzip or Brotli)
import gzip
import json

import requests

query = {
    "user_query": "..."  # a very large string
}

# Serialize to valid JSON, then gzip-compress the UTF-8 bytes
compressed_data = gzip.compress(json.dumps(query).encode("utf-8"))

headers = {
    "Content-Encoding": "gzip",
    "Content-Type": "application/json",
}

response = requests.post(
    "http://localhost:8000/rag/query",
    data=compressed_data,
    headers=headers,
)
Use Decompression (gzip or Brotli)
from fastapi import FastAPI, Request
import gzip
import json

app = FastAPI()

@app.post("/rag/query")
async def rag_query(request: Request):
    # Check if the client sent gzip-compressed data
    if request.headers.get("Content-Encoding") == "gzip":
        # Read the raw bytes from the request body
        compressed_body = await request.body()
        # Decompress the gzip bytes
        decompressed_bytes = gzip.decompress(compressed_body)
        # Decode bytes to string and parse JSON
        data = json.loads(decompressed_bytes.decode("utf-8"))
    else:
        # If not compressed, parse JSON normally
        data = await request.json()
    user_query = data.get("user_query")
    return {"received_query": user_query}
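To check that the client and server sides agree, and to get a rough feel for the savings, the compress/decompress steps above can be exercised without any network call. The following is a minimal sketch using only the standard library; the helper names `compress_query` and `decompress_query` and the sample payload are assumptions made for illustration, not part of the original service.

```python
import gzip
import json


def compress_query(payload: dict) -> bytes:
    """Serialize a payload to JSON and gzip the UTF-8 bytes (client side)."""
    return gzip.compress(json.dumps(payload).encode("utf-8"))


def decompress_query(body: bytes) -> dict:
    """Reverse compress_query: gunzip, decode, and parse the JSON (server side)."""
    return json.loads(gzip.decompress(body).decode("utf-8"))


# Hypothetical large query; repetitive text compresses especially well.
payload = {"user_query": "What is retrieval-augmented generation? " * 500}

raw = json.dumps(payload).encode("utf-8")
compressed = compress_query(payload)
restored = decompress_query(compressed)

print(f"raw: {len(raw)} bytes, gzip: {len(compressed)} bytes")
```

Real user queries are far less repetitive than this sample, so the savings will usually be smaller. For very short payloads the gzip header overhead can even make the compressed body larger than the original, so it may be worth compressing only above some size threshold.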
(5) Avoid Cold Start Problem
