Stream AI-Generated Responses with Google Generative AI and OpenAI

// Assuming you're using the @google/generative-ai library for Gemini
// and StreamingTextResponse (from the "ai" package) to stream responses.

// Import necessary modules.  Ensure you have these installed:
// npm install @google/generative-ai
// npm install openai
// npm install ai
// npm install axios (in case you need to implement your own web search)

const { GoogleGenerativeAI } = require("@google/generative-ai");
const { StreamingTextResponse } = require("ai"); // OpenAIStream is also exported here if you stream OpenAI models

// Constants (replace with your actual values)
const GEMINI_API_KEY = "YOUR_GEMINI_API_KEY"; // Your Gemini API key
const SEARCH_API_KEY = "YOUR_SEARCH_API_KEY"; // If you use an external search API
const SEARCH_ENGINE_ID = "YOUR_SEARCH_ENGINE_ID"; // If using Google Custom Search

async function processChatMessage(model, messages, gemini) { // Pass gemini instance
  if (model.includes('gemini')) {
    try {
      // 1. Configure the Generative AI client - you'll likely do this *once*
      // const genAI = new GoogleGenerativeAI(GEMINI_API_KEY); // Removed as you're passing the instance.

      // 2.  Get the model (You already have it, no need to retrieve it again)
      // const model = genAI.getGenerativeModel({ model: model });

      // 3. Prepare the request. The "tools" field is how we tell Gemini it can search the web
      const generationConfig = { // You can configure generation settings here if desired.
        maxOutputTokens: 2048,
        temperature: 0.7,  // Adjust for creativity
        topP: 1,
        topK: 1,
      };

      const safetySettings = [
        { category: 'HARM_CATEGORY_HARASSMENT', threshold: 'BLOCK_MEDIUM_AND_ABOVE' },
        { category: 'HARM_CATEGORY_HATE_SPEECH', threshold: 'BLOCK_MEDIUM_AND_ABOVE' },
        { category: 'HARM_CATEGORY_SEXUALLY_EXPLICIT', threshold: 'BLOCK_MEDIUM_AND_ABOVE' },
        { category: 'HARM_CATEGORY_DANGEROUS_CONTENT', threshold: 'BLOCK_MEDIUM_AND_ABOVE' },
      ];

      const parts = messages.map(message => ({ text: message.content })); // Convert messages to parts (note: this collapses the history into a single user turn)

      const response = await gemini.generateContentStream({
          contents: [{ role: "user", parts }],
          generationConfig,
          safetySettings,
          tools: [
            {
              functionDeclarations: [ // Note: the JS SDK expects camelCase "functionDeclarations" (an array)
                {
                  name: "search_web",
                  description: "Search the web to provide more up-to-date and relevant information",
                  parameters: {
                    type: "object",
                    properties: {
                      queries: {
                        type: "array",
                        items: {
                          type: "string",
                          description: "Queries to search for",
                        },
                      },
                    },
                    required: ["queries"],
                  },
                },
              ],
            },
          ],
        });

      // 4. Handle the stream: wrap Gemini's chunks in a web ReadableStream.
      const encoder = new TextEncoder();
      const stream = new ReadableStream({
        async start(controller) {
          let fullResponseText = '';
          for await (const chunk of response.stream) {
            const text = chunk.text();
            fullResponseText += text;
            controller.enqueue(encoder.encode(text)); // Enqueue encoded text chunks to the stream
            // console.log("Stream chunk:", text); // Helpful for debugging
          }
          controller.close(); // Important: close the stream when done
          console.log("Full Gemini Response:", fullResponseText); // Log full response
        },
      });

      return new StreamingTextResponse(stream);

    } catch (error) {
      console.error("Gemini Error:", error);
       // Include the original error
      if (error.response) {
          console.error("Gemini Response Data:", error.response.data);
      }
      throw error; // Re-throw the error for the caller to handle
    }
  }

  // If not a Gemini model, return null or handle other models
  return null; // or return an appropriate response for non-Gemini models
}


// --- Example Usage (Illustrative)  ---
// This part demonstrates how you might use the function.  The details of how you
// use the function will depend on your framework (e.g., Next.js, Express, etc.)
async function main() {
  // 1. Initialize Gemini client (do this *once* at startup)
  const genAI = new GoogleGenerativeAI(GEMINI_API_KEY);
  const geminiModel = genAI.getGenerativeModel({ model: "gemini-pro" }); // Or gemini-pro-vision

  // 2. Example messages (from your chat interface or user input)
  const messages = [
    { role: "user", content: "What's the latest news about AI?" },
  ];

  // 3.  Call the processing function
  try {
    const response = await processChatMessage("gemini-pro", messages, geminiModel);

    // 4.  If the response is a StreamingTextResponse, you can stream it.
    if (response) {
      // This part assumes you're in a context (like Next.js) that can return a
      // StreamingTextResponse.  Adjust the code as appropriate for your environment.
      // Example with Next.js API route:
      //  return response;
      //   OR, if you need to consume it and write the response as chunks:

      // StreamingTextResponse is a standard Response, so read its body:
      const reader = response.body.getReader();
      const decoder = new TextDecoder();
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        console.log(decoder.decode(value, { stream: true }));
      }
      console.log("Streaming finished");

    } else {
      console.log("No response or not a Gemini model.");
    }
  } catch (error) {
    console.error("Error in main:", error);
    // Handle the error (e.g., return an error response to the user)
  }
}

// Call the main function to test
main();

Key improvements, explanations, and how to run this code:

  1. Install Necessary Packages:

    npm install @google/generative-ai openai ai axios
    

    You'll need the following, at a minimum:

    • @google/generative-ai: The Google Generative AI client library.
    • ai: The Vercel AI SDK, which exports OpenAIStream and StreamingTextResponse, so be sure this is installed.
    • axios: For making external API calls (like to your own search API, if you decide to use one).
  2. API Keys and Configuration:

    • Replace Placeholders: Crucially, replace "YOUR_GEMINI_API_KEY", "YOUR_SEARCH_API_KEY", and "YOUR_SEARCH_ENGINE_ID" with your actual API keys and search engine ID (if applicable). The search API key is only needed if you are going to call an external API. The search engine ID is only needed if using the Google Custom Search API.
    • Initialization: The code assumes that you initialize the Google Generative AI client once in your application (probably at startup or when the application is first used). This is done in the main() function in the example, so the API key is passed in.
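    To avoid hardcoding keys, you might load them from environment variables at startup. A minimal sketch (the variable names GEMINI_API_KEY and SEARCH_API_KEY are illustrative; use whatever names fit your deployment):

    ```javascript
    // Load API keys from environment variables instead of hardcoding them.
    // Accepts an env object so it can be tested; defaults to process.env.
    function loadConfig(env = process.env) {
      const config = {
        geminiApiKey: env.GEMINI_API_KEY,
        searchApiKey: env.SEARCH_API_KEY ?? null, // optional: only needed for an external search API
      };
      if (!config.geminiApiKey) {
        throw new Error("GEMINI_API_KEY is not set");
      }
      return config;
    }
    ```

    Failing fast when the key is missing surfaces configuration mistakes at startup rather than on the first API call.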
  3. processChatMessage Function:

    • Handles Gemini Models: The code correctly checks if the model includes "gemini".
    • Tools Configuration:
      • tools Parameter (For Web Search): This is the core of the internet access. The tools array declares the search_web function with its description and JSON-schema parameters. The queries parameter is an array of search strings; the model decides on its own when to request the function.
    • Message Conversion: The code correctly maps the chat message format from the messages array to the parts format required by the Gemini API.
    • Generation Settings: You can adjust generationConfig to tune the response (e.g., temperature for creativity).
    • Safety Settings: Includes safetySettings to filter potentially harmful content. This is crucial to prevent inappropriate responses.
    • generateContentStream: Uses generateContentStream so chunks can be forwarded to the client as they are generated, rather than waiting for the full response.
      • Streaming: It creates a ReadableStream to handle the streaming response from Gemini.
      • Iteration and Enqueuing: It correctly iterates over the stream, gets the text chunks, and enqueues them to the stream using controller.enqueue(text). This is the key to streaming the response back to the client (e.g., your user interface) in real-time.
      • Closing the Stream: It correctly closes the stream with controller.close() when the generation is complete. This is essential to signal the end of the stream.
    • Error Handling: Includes a try...catch block to handle potential errors, logs the error, and re-throws the error. Re-throwing the error is important so that your calling function or API route can handle the error.
    • Non-Gemini Model Handling: Returns null if the model is not a Gemini model (allowing you to handle other models).
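    The message conversion in the code above collapses the whole history into a single user turn. If you want Gemini to see a proper multi-turn conversation, each message should become its own entry in contents, with OpenAI-style "assistant" mapped to Gemini's "model" role. A hedged sketch of such a helper (illustrative, not part of the code above):

    ```javascript
    // Convert OpenAI-style chat messages to Gemini's multi-turn `contents`
    // format. Gemini uses the role "model" where OpenAI-style messages use
    // "assistant"; everything else is treated as a "user" turn here.
    function toGeminiContents(messages) {
      return messages.map((message) => ({
        role: message.role === "assistant" ? "model" : "user",
        parts: [{ text: message.content }],
      }));
    }
    ```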
  4. Example Usage (main()):

    • Initialization: The example initializes the Gemini client once (in main()).
    • Message Preparation: The example creates a sample messages array in the correct format. You would adapt this to fit your chat interface and user input.
    • Calling processChatMessage(): Calls processChatMessage() to generate the response.
    • Streaming the Response: This shows how you might consume the StreamingTextResponse body and decode its chunks with a TextDecoder. In a Next.js API route you can simply return the response object; in other frameworks (e.g., Express.js), adapt this part to read the stream and forward the chunks yourself.
  5. How to Use in Your Application:

    • Integration: Integrate the processChatMessage() function into your chat application.
    • User Input: Get the user's input and add it to the messages array (in the correct format).
    • Model Selection: Determine the model to use (e.g., "gemini-pro") based on the user's choice or your application's configuration.
    • API Key: Make sure your Gemini API key is securely stored and accessible to your code (e.g., using environment variables).
    • Streaming Implementation: Adapt the example's streaming code (for await...of) to your application's framework. The most important part is to consume the stream and send text chunks to your client (e.g., using Server-Sent Events, WebSockets, or a similar mechanism).
  6. Important Considerations and Enhancements:

    • Web Search API (if using):
      • The Gemini model decides on its own when to request the search_web function based on the user's query. However, declaring the tool does not execute it: the model responds with a function call that your code must handle by running the search (e.g., via the Google Custom Search API or Bing Search API) and sending the results back to the model in a follow-up turn.
    • Conversation History: For a real chat application, you must store the conversation history (messages array) and pass it to the processChatMessage() function with each user input. This allows the model to maintain context.
    • Error Handling: Implement more robust error handling, including retries, logging, and user-friendly error messages.
    • Rate Limiting: Be aware of rate limits for the Gemini API and any external search APIs you might use. Implement retry logic (with exponential backoff) to handle rate limit errors.
    • Security: Protect your API keys. Do not hardcode them in your code. Use environment variables or a secrets management system.
    • UI/UX: Design a good user interface for your chat application. Consider features like streaming responses, error handling, and clear visual cues.
    • Model Selection: Consider using different Gemini models (e.g., gemini-pro, gemini-pro-vision) depending on the task.
    • Advanced Features: Explore advanced features of the Gemini API, such as function calling, tool use, and custom knowledge retrieval.
    • Authentication: Properly set up your authentication and authorization to access the API.
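    The retry-with-exponential-backoff logic suggested above for rate-limit errors can be sketched as a small wrapper. The defaults (3 retries, 500 ms base delay) are illustrative, not prescriptive:

    ```javascript
    // Retry an async call with exponential backoff: the delay doubles after
    // each failed attempt (baseMs, 2*baseMs, 4*baseMs, ...). Throws the last
    // error once the retry budget is exhausted.
    async function retryWithBackoff(fn, { retries = 3, baseMs = 500 } = {}) {
      let lastError;
      for (let attempt = 0; attempt <= retries; attempt++) {
        try {
          return await fn();
        } catch (error) {
          lastError = error;
          if (attempt === retries) break;
          const delay = baseMs * 2 ** attempt;
          await new Promise((resolve) => setTimeout(resolve, delay));
        }
      }
      throw lastError;
    }
    ```

    A production version would also inspect the error and only retry on retryable statuses (e.g., 429 or 5xx), rather than on every failure.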

This comprehensive solution provides a solid foundation for building a Gemini-powered chat application with internet access. It incorporates best practices for streaming, error handling, and API key management. The example demonstrates the essential parts and provides a clear guide for integrating it into your project. Remember to replace placeholder values (especially API keys) with your actual credentials and to handle the streaming response correctly in your frontend.