From 96886f18ac81ba9908c1104dd0c158f4c8d03757 Mon Sep 17 00:00:00 2001
From: YeonGyu-Kim
Date: Sat, 13 Dec 2025 15:26:44 +0900
Subject: [PATCH] docs: add look_at tool and multimodal-looker agent documentation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

🤖 GENERATED WITH ASSISTANCE OF [OhMyOpenCode](https://github.com/code-yeongyu/oh-my-opencode)
---
 README.ko.md | 7 +++++++
 README.md    | 7 +++++++
 2 files changed, 14 insertions(+)

diff --git a/README.ko.md b/README.ko.md
index 5f74663..7727081 100644
--- a/README.ko.md
+++ b/README.ko.md
@@ -218,6 +218,7 @@ OpenCode is highly extensible and highly customizable.
 - **explore** (`opencode/grok-code`): Fast codebase exploration and file pattern matching. Claude Code uses Haiku, but we use Grok. It is currently free, extremely fast, and intelligent enough for file exploration tasks. Inspired by Claude Code.
 - **frontend-ui-ux-engineer** (`google/gemini-3-pro-preview`): Has the persona of a designer turned developer. Creates stunning UIs. Uses Gemini, which excels at generating beautiful and creative UI code.
 - **document-writer** (`google/gemini-3-pro-preview`): Has the persona of a technical documentation expert. Gemini is a wordsmith. It writes superbly.
+- **multimodal-looker** (`google/gemini-2.5-flash`): A specialized agent for interpreting visual content. Analyzes PDFs, images, and diagrams to extract information.
 
 Each agent is invoked automatically by the main agent, but you can also request one explicitly:
 
@@ -270,6 +271,12 @@ OpenCode is highly extensible and highly customizable.
   - The default `glob` has no timeout. If ripgrep hangs, it waits indefinitely.
   - This tool enforces a timeout and terminates the process when it expires.
 
+#### Built-in Multimodal Tools
+
+- **look_at**: Uses Gemini 2.5 Flash to analyze media files that require visual interpretation (PDFs, images, diagrams, and so on). Inspired by Sourcegraph Ampcode's `look_at` tool.
+  - Parameters: `file_path` (absolute path), `goal` (information to extract)
+  - Use cases: PDF text extraction, image description, diagram analysis
+
 #### Built-in MCPs
 
 - **websearch_exa**: Exa AI web search. Performs real-time web search and content scraping. Returns LLM-optimized context from relevant websites.
diff --git a/README.md b/README.md
index 76d6b65..82da0cd 100644
--- a/README.md
+++ b/README.md
@@ -215,6 +215,7 @@ I believe in the right tool for the job. For your wallet's sake, use CLIProxyAPI
 - **explore** (`opencode/grok-code`): Fast exploration and pattern matching. Claude Code uses Haiku; we use Grok. It is currently free, blazing fast, and intelligent enough for file traversal. Inspired by Claude Code.
 - **frontend-ui-ux-engineer** (`google/gemini-3-pro-preview`): A designer turned developer. Creates stunning UIs. Uses Gemini because its creativity and UI code generation are superior.
 - **document-writer** (`google/gemini-3-pro-preview`): A technical writing expert. Gemini is a wordsmith; it writes prose that flows naturally.
+- **multimodal-looker** (`google/gemini-2.5-flash`): Specialized agent for visual content interpretation. Analyzes PDFs, images, and diagrams to extract information.
 
 Each agent is automatically invoked by the main agent, but you can also explicitly request them:
 
@@ -269,6 +270,12 @@ The features you use in your editor, other agents cannot access them. Oh My Ope
   - The default `glob` lacks timeout. If ripgrep hangs, it waits indefinitely.
   - This tool enforces timeouts and kills the process on expiration.
 
+#### Built-in Multimodal Tools
+
+- **look_at**: Analyzes media files (PDFs, images, diagrams) that require visual interpretation using Gemini 2.5 Flash. Inspired by Sourcegraph Ampcode's `look_at` tool.
+  - Parameters: `file_path` (absolute path), `goal` (what to extract)
+  - Use cases: PDF text extraction, image description, diagram analysis
+
 #### Built-in MCPs
 
 - **websearch_exa**: Exa AI web search. Performs real-time web searches and can scrape content from specific URLs. Returns LLM-optimized context from relevant websites.
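
The `look_at` parameters documented in this patch could be modelled as a small typed argument object. The sketch below is illustrative only: the patch confirms just the two parameter names, `file_path` and `goal`, and that `file_path` is an absolute path. The `LookAtArgs` interface, the `validateLookAtArgs` helper, and the sample paths are hypothetical and are not taken from the oh-my-opencode source.

```typescript
import { isAbsolute } from "node:path";

// Hypothetical argument shape; only the two field names come from the README.
interface LookAtArgs {
  file_path: string; // absolute path to the media file (PDF, image, diagram)
  goal: string; // what information to extract from the file
}

// Hypothetical helper: returns a list of problems, empty when the arguments look valid.
function validateLookAtArgs(args: LookAtArgs): string[] {
  const problems: string[] = [];
  if (!isAbsolute(args.file_path)) {
    problems.push(`file_path must be an absolute path, got "${args.file_path}"`);
  }
  if (args.goal.trim().length === 0) {
    problems.push("goal must describe what to extract");
  }
  return problems;
}

// A well-formed request: absolute path plus a concrete extraction goal.
const ok = validateLookAtArgs({
  file_path: "/docs/architecture.pdf",
  goal: "List the services shown in the diagram on page 2",
});

// A malformed request: relative path and a blank goal.
const bad = validateLookAtArgs({ file_path: "diagram.png", goal: "  " });

console.log(ok);
console.log(bad);
```

Checking the arguments before handing the file to the model keeps a relative path or an empty goal from reaching the agent; whether the actual tool performs a similar check is not stated in the patch.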