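Multimodal

MAF supports passing images to an Agent so that the Agent can analyze and respond to image content.

Passing Images to an Agent

To send an image to an Agent, create a ChatMessage that contains both text content and image content.

Creating the Agent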

csharp

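using Azure.AI.Projects;   // AIProjectClient
using Azure.Identity;      // DefaultAzureCredential
using Microsoft.Agents.AI; // AIAgent

// Connect to the Foundry project and wrap it as an agent backed by a vision-capable model.
AIAgent agent = new AIProjectClient(
    new Uri("<your-foundry-project-endpoint>"),
    new DefaultAzureCredential())
    .AsAIAgent(
        model: "gpt-4o",
        name: "VisionAgent",
        instructions: "You are a helpful agent that can analyze images");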

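Building a Multimodal Message

Use TextContent for the text part and UriContent for the image. UriContent takes the image URI and its media type: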

csharp

ChatMessage message = new(ChatRole.User, [
    new TextContent("What do you see in this image?"),
    new UriContent(
        "https://example.com/image.jpg",
        "image/jpeg")
]);
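
When the image is not reachable by URL, the bytes can be embedded in the message instead. The sketch below assumes the `DataContent` type from Microsoft.Extensions.AI; the byte array is a placeholder (just the PNG file signature) standing in for real image data, and whether a given provider accepts inline image data is an assumption to verify against that provider.

```csharp
using System;
using Microsoft.Extensions.AI;

// Placeholder bytes (the 8-byte PNG signature) standing in for real image data.
// In practice these would come from a file or a stream.
byte[] imageBytes = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A];

// Embed the image bytes directly in the message, tagged with their media type.
ChatMessage inlineMessage = new(ChatRole.User, [
    new TextContent("What do you see in this image?"),
    new DataContent(imageBytes, "image/png")
]);

// The message carries two content items: one text, one image.
Console.WriteLine(inlineMessage.Contents.Count);
```

Such a message is passed to the Agent in the same way as one built with UriContent.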

Running the Agent

csharp

Console.WriteLine(await agent.RunAsync(message));

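The Agent analyzes the image and returns a description, or answers questions about the image.

Notes

Multimodal input depends on the capabilities of the underlying model (for example, GPT-4o); not all models support image input.
Supported image formats depend on the LLM provider's implementation.
Multiple images can be passed to the Agent in a single message.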
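
As a minimal sketch of the last note, several images can be placed in one message by adding one UriContent per image. The two URLs and the prompt text here are hypothetical and used only for illustration.

```csharp
using System;
using Microsoft.Extensions.AI;

// Hypothetical image URLs, for illustration only.
ChatMessage multiImageMessage = new(ChatRole.User, [
    new TextContent("Compare these two images. What differs between them?"),
    new UriContent("https://example.com/before.jpg", "image/jpeg"),
    new UriContent("https://example.com/after.jpg", "image/jpeg")
]);

// One text item plus two image items.
Console.WriteLine(multiImageMessage.Contents.Count);
```

The message is then sent with `agent.RunAsync(multiImageMessage)`, exactly as in the single-image case. How many images a single request may carry depends on the underlying model and provider.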

Next: Structured Output, for getting type-safe Agent responses.
