ChatGoogleGenerativeAI
You can access Google's Gemini and Gemini Vision models, as well as other
generative models, in LangChain through the ChatGoogleGenerativeAI class in the
@langchain/google-genai integration package.
You can also access Google's Gemini family of models via the LangChain VertexAI and VertexAI-web integrations.
Click here to read the docs.
Get an API key here: https://ai.google.dev/tutorials/setup
You'll first need to install the @langchain/google-genai package:
- npm: npm install @langchain/google-genai
- Yarn: yarn add @langchain/google-genai
- pnpm: pnpm add @langchain/google-genai
Usage
We're unifying model params across all packages. We now suggest using model instead of modelName, and apiKey for API keys.
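For instance, a minimal constructor sketch using the suggested names (the apiKey is shown explicitly for illustration; it falls back to the GOOGLE_API_KEY environment variable if omitted):
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
// `model` is the suggested param (rather than the older `modelName`),
// and `apiKey` is the suggested way to pass an API key.
const llm = new ChatGoogleGenerativeAI({
  model: "gemini-pro",
  apiKey: process.env.GOOGLE_API_KEY, // optional if GOOGLE_API_KEY is set
});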
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { HarmBlockThreshold, HarmCategory } from "@google/generative-ai";
/*
 * Before running this, you should make sure you have created a
 * Google Cloud Project that has `generativelanguage` API enabled.
 *
 * You will also need to generate an API key and set
 * an environment variable GOOGLE_API_KEY
 *
 */
// Text
const model = new ChatGoogleGenerativeAI({
  model: "gemini-pro",
  maxOutputTokens: 2048,
  safetySettings: [
    {
      category: HarmCategory.HARM_CATEGORY_HARASSMENT,
      threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    },
  ],
});
// Batch and stream are also supported
const res = await model.invoke([
  [
    "human",
    "What would be a good company name for a company that makes colorful socks?",
  ],
]);
console.log(res);
/*
  AIMessage {
    content: '1. Rainbow Soles\n' +
      '2. Toe-tally Colorful\n' +
      '3. Bright Sock Creations\n' +
      '4. Hue Knew Socks\n' +
      '5. The Happy Sock Factory\n' +
      '6. Color Pop Hosiery\n' +
      '7. Sock It to Me!\n' +
      '8. Mismatched Masterpieces\n' +
      '9. Threads of Joy\n' +
      '10. Funky Feet Emporium\n' +
      '11. Colorful Threads\n' +
      '12. Sole Mates\n' +
      '13. Colorful Soles\n' +
      '14. Sock Appeal\n' +
      '15. Happy Feet Unlimited\n' +
      '16. The Sock Stop\n' +
      '17. The Sock Drawer\n' +
      '18. Sole-diers\n' +
      '19. Footloose Footwear\n' +
      '20. Step into Color',
    name: 'model',
    additional_kwargs: {}
  }
*/
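The comment above notes that batch and stream are also supported. A minimal sketch of both, reusing the same model instance (the prompts are illustrative):
// Batch: run several prompts in parallel
const batchRes = await model.batch([
  [["human", "Name a color."]],
  [["human", "Name a fruit."]],
]);
console.log(batchRes.length); // 2

// Stream: iterate over AIMessageChunks as they arrive
const stream = await model.stream([["human", "Tell me a short joke."]]);
for await (const chunk of stream) {
  console.log(chunk.content);
}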
API Reference:
- ChatGoogleGenerativeAI from @langchain/google-genai
Tool calling
import { StructuredTool } from "@langchain/core/tools";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { z } from "zod";
const model = new ChatGoogleGenerativeAI({
  model: "gemini-pro",
});
// Define your tool
class FakeBrowserTool extends StructuredTool {
  schema = z.object({
    url: z.string(),
    query: z.string().optional(),
  });
  name = "fake_browser_tool";
  description =
    "useful for when you need to find something on the web or summarize a webpage.";
  async _call(_: z.infer<this["schema"]>): Promise<string> {
    return "fake_browser_tool";
  }
}
// Bind your tools to the model
const modelWithTools = model.bind({
  tools: [new FakeBrowserTool()],
});
// Or, you can use `.bindTools` which works the same under the hood
// const modelWithTools = model.bindTools([new FakeBrowserTool()]);
const res = await modelWithTools.invoke([
  [
    "human",
    "Search the web and tell me what the weather will be like tonight in new york. use a popular weather website",
  ],
]);
console.log(res.tool_calls);
/*
[
  {
    name: 'fake_browser_tool',
    args: {
      query: 'weather in new york',
      url: 'https://www.google.com/search?q=weather+in+new+york'
    }
  }
]
*/
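The model only proposes a tool call; executing the tool and handling its output is up to your code. A minimal sketch of running the returned call, continuing the example above (the control flow is illustrative):
const browserTool = new FakeBrowserTool();
for (const toolCall of res.tool_calls ?? []) {
  // Run the tool with the arguments the model generated
  const output = await browserTool.invoke(toolCall.args);
  console.log(output); // "fake_browser_tool"
}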
API Reference:
- StructuredTool from @langchain/core/tools
- ChatGoogleGenerativeAI from @langchain/google-genai
See the above run's LangSmith trace here.
.withStructuredOutput
import { StructuredTool } from "@langchain/core/tools";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { z } from "zod";
const model = new ChatGoogleGenerativeAI({
  model: "gemini-pro",
});
// Define your tool
class FakeBrowserTool extends StructuredTool {
  schema = z.object({
    url: z.string(),
    query: z.string().optional(),
  });
  name = "fake_browser_tool";
  description =
    "useful for when you need to find something on the web or summarize a webpage.";
  async _call(_: z.infer<this["schema"]>): Promise<string> {
    return "fake_browser_tool";
  }
}
const tool = new FakeBrowserTool();
// Bind your tools to the model
const modelWithTools = model.withStructuredOutput(tool.schema, {
  name: tool.name, // this is optional
});
// Optionally, you can pass just a Zod schema, or an equivalent JSON schema object
// const modelWithTools = model.withStructuredOutput(
//   zodSchema,
// );
const res = await modelWithTools.invoke([
  [
    "human",
    "Search the web and tell me what the weather will be like tonight in new york. use a popular weather website",
  ],
]);
console.log(res);
/*
{
  url: 'https://www.accuweather.com/en/us/new-york-ny/10007/night-weather-forecast/349014',
  query: 'weather tonight'
}
*/
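As the comment above notes, you can also pass a Zod schema directly instead of reusing a tool's schema. A minimal sketch with an illustrative schema and prompt:
const weatherSchema = z.object({
  location: z.string().describe("The city to look up"),
  unit: z.enum(["celsius", "fahrenheit"]).optional(),
});
const weatherModel = model.withStructuredOutput(weatherSchema);
const weather = await weatherModel.invoke([
  ["human", "What's the weather like in Paris, in celsius?"],
]);
// `weather` is a plain object matching weatherSchema
console.log(weather);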
API Reference:
- StructuredTool from @langchain/core/tools
- ChatGoogleGenerativeAI from @langchain/google-genai
See the above run's LangSmith trace here.
Multimodal support
To provide an image, pass a human message with a content field set to an array of content objects. Each content object contains either an image value (type of image_url) or a text value (type of text). The value of image_url must be a base64-encoded image (e.g., data:image/png;base64,abcd124):
import fs from "fs";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { HumanMessage } from "@langchain/core/messages";
// Multi-modal
const vision = new ChatGoogleGenerativeAI({
  model: "gemini-pro-vision",
  maxOutputTokens: 2048,
});
const image = fs.readFileSync("./hotdog.jpg").toString("base64");
const input2 = [
  new HumanMessage({
    content: [
      {
        type: "text",
        text: "Describe the following image.",
      },
      {
        type: "image_url",
        image_url: `data:image/png;base64,${image}`,
      },
    ],
  }),
];
const res2 = await vision.invoke(input2);
console.log(res2);
/*
  AIMessage {
    content: ' The image shows a hot dog in a bun. The hot dog is grilled and has a dark brown color. The bun is toasted and has a light brown color. The hot dog is in the center of the bun.',
    name: 'model',
    additional_kwargs: {}
  }
*/
// Multi-modal streaming
const res3 = await vision.stream(input2);
for await (const chunk of res3) {
  console.log(chunk);
}
/*
  AIMessageChunk {
    content: ' The image shows a hot dog in a bun. The hot dog is grilled and has grill marks on it. The bun is toasted and has a light golden',
    name: 'model',
    additional_kwargs: {}
  }
  AIMessageChunk {
    content: ' brown color. The hot dog is in the center of the bun.',
    name: 'model',
    additional_kwargs: {}
  }
*/
API Reference:
- ChatGoogleGenerativeAI from @langchain/google-genai
- HumanMessage from @langchain/core/messages
Gemini Prompting FAQs
As of the time this doc was written (2023/12/12), Gemini has some restrictions on the types and structure of prompts it accepts. Specifically:
- When providing multimodal (image) inputs, you are restricted to at most 1 message of "human" (user) type. You cannot pass multiple messages (though the single human message may have multiple content entries).
- System messages are not natively supported, and will be merged with the first human message if present (see the sketch after this list).
- For regular chat conversations, messages must follow the human/ai/human/ai alternating pattern. You may not provide 2 AI or human messages in sequence.
- Messages may be blocked if they violate the safety checks of the LLM. In this case, the model will return an empty response.
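As an illustration of the second and third points, here is a minimal sketch of a conversation that starts with a system message; the integration merges it into the first human message before sending, and the remaining turns must alternate human/ai (the prompt text is illustrative):
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { HumanMessage, SystemMessage } from "@langchain/core/messages";
const model = new ChatGoogleGenerativeAI({ model: "gemini-pro" });
// The system message has no native Gemini equivalent, so it is merged
// into the first human message; subsequent turns must alternate human/ai.
const res = await model.invoke([
  new SystemMessage("You are a terse assistant that answers in one sentence."),
  new HumanMessage("Why is the sky blue?"),
]);
console.log(res.content);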