Stream AI-Generated Responses with Google Generative AI and OpenAI
// Streaming Gemini responses using the @google/generative-ai library
// and the `ai` SDK's StreamingTextResponse helper.
// Import necessary modules. Ensure you have these installed:
// npm install @google/generative-ai
// npm install ai
// npm install openai (only if you also stream from OpenAI models)
// npm install axios (only if you implement your own web-search call)
const { GoogleGenerativeAI } = require("@google/generative-ai");
const { StreamingTextResponse } = require("ai"); // OpenAIStream (same package) is only needed for OpenAI models
// Constants (replace with your actual values)
const GEMINI_API_KEY = "YOUR_GEMINI_API_KEY"; // Your Gemini API key
const SEARCH_API_KEY = "YOUR_SEARCH_API_KEY"; // If you use an external search API
const SEARCH_ENGINE_ID = "YOUR_SEARCH_ENGINE_ID"; // If using Google Custom Search
async function processChatMessage(model, messages, gemini) { // gemini: an initialized GenerativeModel instance
  if (model.includes("gemini")) {
    try {
      // 1. Generation settings: tune these for your use case.
      const generationConfig = {
        maxOutputTokens: 2048,
        temperature: 0.7, // Adjust for creativity
        topP: 1,
        topK: 1,
      };
      // 2. Safety settings filter potentially harmful content.
      const safetySettings = [
        { category: "HARM_CATEGORY_HARASSMENT", threshold: "BLOCK_MEDIUM_AND_ABOVE" },
        { category: "HARM_CATEGORY_HATE_SPEECH", threshold: "BLOCK_MEDIUM_AND_ABOVE" },
        { category: "HARM_CATEGORY_SEXUALLY_EXPLICIT", threshold: "BLOCK_MEDIUM_AND_ABOVE" },
        { category: "HARM_CATEGORY_DANGEROUS_CONTENT", threshold: "BLOCK_MEDIUM_AND_ABOVE" },
      ];
      // 3. Convert messages to the `parts` format the Gemini API expects.
      const parts = messages.map((message) => ({ text: message.content }));
      // The `tools` array declares a search_web function the model may call.
      const response = await gemini.generateContentStream({
        contents: [{ role: "user", parts }],
        generationConfig,
        safetySettings,
        tools: [
          {
            functionDeclarations: [
              {
                name: "search_web",
                description: "Search the web to provide more up-to-date and relevant information",
                parameters: {
                  type: "OBJECT",
                  properties: {
                    queries: {
                      type: "ARRAY",
                      items: {
                        type: "STRING",
                        description: "Queries to search for",
                      },
                    },
                  },
                  required: ["queries"],
                },
              },
            ],
          },
        ],
      });
      // 4. Handle the Stream: wrap the Gemini stream in a ReadableStream of
      // encoded bytes so consumers can decode it with a TextDecoder.
      const encoder = new TextEncoder();
      const stream = new ReadableStream({
        async start(controller) {
          let fullResponseText = "";
          for await (const chunk of response.stream) {
            const text = chunk.text();
            fullResponseText += text;
            controller.enqueue(encoder.encode(text)); // Enqueue encoded chunks
          }
          controller.close(); // Important: signal the end of the stream
          console.log("Full Gemini Response:", fullResponseText);
        },
      });
      return new StreamingTextResponse(stream);
    } catch (error) {
      console.error("Gemini Error:", error);
      if (error.response) {
        console.error("Gemini Response Data:", error.response.data);
      }
      throw error; // Re-throw so the caller can handle it
    }
  }
  // Not a Gemini model: return null so the caller can route to another provider.
  return null;
}
// --- Example Usage (Illustrative) ---
// How you call this function depends on your framework (e.g., Next.js, Express).
async function main() {
  // 1. Initialize the Gemini client once at startup.
  const genAI = new GoogleGenerativeAI(GEMINI_API_KEY);
  const geminiModel = genAI.getGenerativeModel({ model: "gemini-pro" }); // or gemini-pro-vision
  // 2. Example messages (from your chat interface or user input).
  const messages = [
    { role: "user", content: "What's the latest news about AI?" },
  ];
  // 3. Call the processing function.
  try {
    const response = await processChatMessage("gemini-pro", messages, geminiModel);
    if (response) {
      // In a Next.js API route you could simply `return response;`.
      // To consume it manually, read the underlying Response body
      // (a web ReadableStream, async-iterable in Node 18+):
      const decoder = new TextDecoder();
      for await (const chunk of response.body) {
        console.log(decoder.decode(chunk, { stream: true }));
      }
      console.log("Streaming finished");
    } else {
      console.log("No response or not a Gemini model.");
    }
  } catch (error) {
    console.error("Error in main:", error);
    // Handle the error (e.g., return an error response to the user)
  }
}
// Call the main function to test
main();
Key improvements, explanations, and how to run this code:

- Install Necessary Packages:
  npm install @google/generative-ai openai ai axios
  You'll need the following, at a minimum:
  - @google/generative-ai: The Google Generative AI client library.
  - ai: Provides the OpenAIStream and StreamingTextResponse helpers, so be sure this is installed.
  - axios: Only needed for making external API calls (e.g., to your own search API, if you decide to use one).

- API Keys and Configuration:
  - Replace Placeholders: Crucially, replace "YOUR_GEMINI_API_KEY", "YOUR_SEARCH_API_KEY", and "YOUR_SEARCH_ENGINE_ID" with your actual API keys and search engine ID (if applicable). The search API key is only needed if you call an external search API; the search engine ID is only needed for the Google Custom Search API.
  - Initialization: Initialize the Google Generative AI client once in your application (at startup or on first use). In the example this happens in main(), where the API key is passed in.
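Rather than editing the placeholder constants in source, you can load the keys from the environment at startup. A minimal sketch (the variable names GEMINI_API_KEY, SEARCH_API_KEY, and SEARCH_ENGINE_ID are conventions of this sketch, not SDK requirements):

```javascript
// Read configuration from environment variables instead of hardcoding keys.
function loadConfig(env = process.env) {
  const apiKey = env.GEMINI_API_KEY;
  if (!apiKey) {
    throw new Error("GEMINI_API_KEY is not set");
  }
  return {
    geminiApiKey: apiKey,
    searchApiKey: env.SEARCH_API_KEY || null,     // optional
    searchEngineId: env.SEARCH_ENGINE_ID || null, // optional
  };
}

// Example (with an injected env object for illustration):
const config = loadConfig({ GEMINI_API_KEY: "test-key" });
console.log(config.geminiApiKey); // → test-key
```

Failing fast when a required key is missing gives a clearer error than a 401 from the API later.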
- processChatMessage Function:
  - Handles Gemini Models: The code checks whether the model name includes "gemini".
  - Tools Configuration: The `tools` array is how the model gains web access. It declares a search_web function (via functionDeclarations) with a description and a `queries` array parameter; the model decides automatically when to call it.
  - Message Conversion: The chat messages are mapped from the `messages` array to the `parts` format required by the Gemini API.
  - Generation Settings: Adjust generationConfig to tune the response (e.g., temperature for creativity).
  - Safety Settings: safetySettings filter potentially harmful content, which is crucial to prevent inappropriate responses.
  - generateContentStream: Uses generateContentStream to handle streaming efficiently.
  - Streaming: A ReadableStream wraps the streaming response from Gemini.
  - Iteration and Enqueuing: The code iterates over the stream, reads each text chunk, and enqueues it with controller.enqueue(). This is the key to streaming the response back to the client (e.g., your user interface) in real time.
  - Closing the Stream: controller.close() is called when generation is complete, which is essential to signal the end of the stream.
  - Error Handling: A try...catch block logs and re-throws errors. Re-throwing matters so your calling function or API route can handle them.
  - Non-Gemini Model Handling: Returns null if the model is not a Gemini model, allowing you to handle other models.
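Note that declaring search_web only tells the model the function exists; when the model responds with a function call, your code must execute the search and return the result to the model in a follow-up request. A sketch of the dispatch side, under stated assumptions (the `webSearch` stub and the `dispatchToolCall` helper are my own; check your SDK version for the exact shape of function-call parts):

```javascript
// Stub web search; replace with a real call (e.g., axios to a search API).
async function webSearch(query) {
  return { query, snippets: [`(stub result for "${query}")`] };
}

// Map tool names to implementations. When a streamed chunk contains a
// function call, look the name up here, execute it, and send the result
// back to the model as a functionResponse part in a follow-up request.
const toolImplementations = {
  search_web: async ({ queries }) => {
    const results = await Promise.all(queries.map((q) => webSearch(q)));
    return { results };
  },
};

async function dispatchToolCall(call) {
  const impl = toolImplementations[call.name];
  if (!impl) {
    throw new Error(`Unknown tool: ${call.name}`);
  }
  return impl(call.args);
}
```

For example, `dispatchToolCall({ name: "search_web", args: { queries: ["ai news"] } })` resolves to the stubbed search results; in a real app you would feed that object back to the model.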
- Example Usage (main()):
  - Initialization: The example initializes the Gemini client once, in main().
  - Message Preparation: A sample `messages` array is created in the correct format; adapt this to fit your chat interface and user input.
  - Calling processChatMessage(): Generates the response.
  - Streaming the Response: Shows how to consume the StreamingTextResponse by reading its body with a TextDecoder. Adapt this to how your framework handles streaming (e.g., Next.js, Express.js). The for await...of loop is essential for consuming the stream.
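If you forward the stream to a browser over Server-Sent Events, each chunk must be framed as a `data:` line followed by a blank line. A minimal sketch (the `formatSseEvent` helper and the Express route are illustrative, not part of any SDK):

```javascript
// Frame a text chunk as a Server-Sent Events message. Each event is
// "data: <payload>\n\n"; JSON-encoding the payload keeps newlines intact.
function formatSseEvent(text) {
  return `data: ${JSON.stringify(text)}\n\n`;
}

// Illustrative Express handler (assumes processChatMessage from above):
//
// app.post("/chat", async (req, res) => {
//   res.setHeader("Content-Type", "text/event-stream");
//   const response = await processChatMessage("gemini-pro", req.body.messages, geminiModel);
//   const decoder = new TextDecoder();
//   for await (const chunk of response.body) {
//     res.write(formatSseEvent(decoder.decode(chunk, { stream: true })));
//   }
//   res.end();
// });

console.log(formatSseEvent("Hello world"));
```

On the client, an EventSource (or fetch-based reader) would `JSON.parse` each event's data field to recover the original text.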
- How to Use in Your Application:
  - Integration: Integrate the processChatMessage() function into your chat application.
  - User Input: Get the user's input and add it to the `messages` array in the correct format.
  - Model Selection: Determine the model to use (e.g., "gemini-pro") based on the user's choice or your application's configuration.
  - API Key: Make sure your Gemini API key is securely stored and accessible to your code (e.g., via environment variables).
  - Streaming Implementation: Adapt the example's streaming code (for await...of) to your framework. The most important part is to consume the stream and send text chunks to your client (e.g., using Server-Sent Events, WebSockets, or a similar mechanism).
- Important Considerations and Enhancements:
  - Web Search API (if using): The Gemini model decides automatically when to call the search_web function based on the user's query. You can back it with an external search API (Google Custom Search, Bing Search, etc.); your code is responsible for executing the search and returning the results to the model.
  - Conversation History: For a real chat application, you must store the conversation history (the `messages` array) and pass it to processChatMessage() with each user input so the model maintains context.
  - Error Handling: Implement more robust error handling, including retries, logging, and user-friendly error messages.
  - Rate Limiting: Be aware of rate limits for the Gemini API and any external search APIs you might use. Implement retry logic with exponential backoff to handle rate-limit errors.
  - Security: Protect your API keys. Do not hardcode them; use environment variables or a secrets management system.
  - UI/UX: Design a good user interface for your chat application, with streaming responses, error handling, and clear visual cues.
  - Model Selection: Consider different Gemini models (e.g., gemini-pro, gemini-pro-vision) depending on the task.
  - Advanced Features: Explore advanced Gemini API features such as function calling, tool use, and custom knowledge retrieval.
  - Authentication: Properly set up authentication and authorization for API access.
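The rate-limiting advice above can be sketched as a small retry wrapper (the function names and the 429 check are my own conventions; adjust `isRetryable` to the actual error shape your SDK throws):

```javascript
// Compute the delay for a retry attempt: base * 2^attempt, capped at maxMs.
function backoffDelay(attempt, baseMs = 500, maxMs = 8000) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Retry an async operation with exponential backoff.
// `isRetryable` decides which errors merit a retry (e.g., HTTP 429).
async function withRetries(fn, { retries = 3, isRetryable = () => true } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries || !isRetryable(err)) throw err;
      await new Promise((r) => setTimeout(r, backoffDelay(attempt)));
    }
  }
}
```

For example, you might wrap the Gemini call as `withRetries(() => gemini.generateContentStream(request), { isRetryable: (e) => e.status === 429 })`, assuming the thrown error carries an HTTP status.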
This provides a solid foundation for building a Gemini-powered chat application with internet access, covering streaming, error handling, and API key management. Remember to replace the placeholder values (especially API keys) with your actual credentials and to handle the streaming response correctly in your frontend.