Update docs and tool description for read-many-files. (#456)

Jacob Richman 2025-05-20 16:32:49 -07:00 committed by GitHub
parent 17e28036fa
commit 937f473651
2 changed files with 31 additions and 19 deletions


@@ -4,33 +4,35 @@ This document provides details on the `read_many_files` tool.
## `read_many_files`
- **Purpose:** Reads content from multiple text files specified by paths or glob patterns and concatenates them into a single string. This is useful for getting an overview of a codebase, finding where specific functionality is implemented, reviewing documentation, or gathering context from multiple configuration files.
- **Purpose:** Reads content from multiple files specified by paths or glob patterns. For text files, it concatenates their content into a single string. For image (e.g., PNG, JPEG) and PDF files, it reads and returns them as base64 encoded data, provided they are explicitly requested by name or extension. This is useful for getting an overview of a codebase, finding where specific functionality is implemented, reviewing documentation, or gathering context from multiple configuration files.
- **Arguments:**
- `paths` (list[string], required): An array of glob patterns or paths relative to the tool's target directory (e.g., `["src/**/*.ts"]`, `["README.md", "docs/"]`).
- `paths` (list[string], required): An array of glob patterns or paths relative to the tool's target directory (e.g., `["src/**/*.ts"]`, `["README.md", "docs/", "assets/logo.png"]`).
- `exclude` (list[string], optional): Glob patterns for files/directories to exclude (e.g., `["**/*.log", "temp/"]`). These are added to default excludes if `useDefaultExcludes` is true.
- `include` (list[string], optional): Additional glob patterns to include. These are merged with `paths` (e.g., `["*.test.ts"]` to specifically add test files if they were broadly excluded).
- `include` (list[string], optional): Additional glob patterns to include. These are merged with `paths` (e.g., `["*.test.ts"]` to specifically add test files if they were broadly excluded, or `["images/*.jpg"]` to include specific image types).
- `recursive` (boolean, optional): Whether to search recursively. This is primarily controlled by `**` in glob patterns. Defaults to `true`.
- `useDefaultExcludes` (boolean, optional): Whether to apply a list of default exclusion patterns (e.g., `node_modules`, `.git`, binary files). Defaults to `true`.
- `useDefaultExcludes` (boolean, optional): Whether to apply a list of default exclusion patterns (e.g., `node_modules`, `.git`, non-image/PDF binary files). Defaults to `true`.
- **Behavior:**
- The tool searches for files matching the provided `paths` and `include` patterns, while respecting `exclude` patterns and default excludes (if enabled).
- It reads the content of each matched text file (attempting to skip binary files).
- The content of all successfully read files is concatenated into a single string, with a separator `--- {filePath} ---` between the content of each file.
- Uses UTF-8 encoding by default.
- For text files: it reads the content of each matched file (attempting to skip binary files not explicitly requested as image/PDF) and concatenates it into a single string, with a separator `--- {filePath} ---` between the content of each file. Uses UTF-8 encoding by default.
- For image and PDF files: if explicitly requested by name or extension (e.g., `paths: ["logo.png"]` or `include: ["*.pdf"]`), the tool reads the file and returns its content as a base64 encoded string.
- The tool attempts to detect and skip other binary files (those not matching common image/PDF types or not explicitly requested) by checking for null bytes in their initial content.
- **Examples:**
- Reading all TypeScript files in the `src` directory:
```
read_many_files(paths=["src/**/*.ts"])
```
- Reading the main README and all Markdown files in the `docs` directory, excluding a specific file:
- Reading the main README, all Markdown files in the `docs` directory, and a specific logo image, excluding a specific file:
```
read_many_files(paths=["README.md", "docs/**/*.md"], exclude=["docs/OLD_README.md"])
read_many_files(paths=["README.md", "docs/**/*.md", "assets/logo.png"], exclude=["docs/OLD_README.md"])
```
- Reading all JavaScript files but explicitly including test files that might otherwise be excluded by a global pattern:
- Reading all JavaScript files but explicitly including test files and all JPEGs in an `images` folder:
```
read_many_files(paths=["**/*.js"], include=["**/*.test.js"], useDefaultExcludes=False)
read_many_files(paths=["**/*.js"], include=["**/*.test.js", "images/**/*.jpg"], useDefaultExcludes=False)
```
- **Important Notes:**
- **Binary Files:** This tool is designed for text files and attempts to skip binary files. Its behavior with binary content is not guaranteed.
- **Binary File Handling:**
- **Image/PDF Files:** The tool can read common image types (PNG, JPEG, etc.) and PDF files, returning them as base64 encoded data. These files _must_ be explicitly targeted by the `paths` or `include` patterns (e.g., by specifying the exact filename like `image.png` or a pattern like `*.jpeg`).
- **Other Binary Files:** The tool attempts to detect and skip other types of binary files by examining their initial content for null bytes. Such files are excluded from the output.
- **Performance:** Reading a very large number of files or very large individual files can be resource-intensive.
- **Path Specificity:** Ensure paths and glob patterns are correctly specified relative to the tool's target directory.
- **Path Specificity:** Ensure paths and glob patterns are correctly specified relative to the tool's target directory. For image/PDF files, ensure the patterns are specific enough to include them.
- **Default Excludes:** Be aware of the default exclusion patterns (like `node_modules`, `.git`) and use `useDefaultExcludes=False` if you need to override them, but do so cautiously.
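
The behavior described in the updated documentation can be illustrated with a minimal sketch. It is not the tool's actual implementation: the `InputFile`, `InlineData`, and `ReadResult` types, the `MEDIA_TYPES` table, the `looksBinary`, `extensionOf`, and `readMany` helpers, and the 1024-byte sample size are all hypothetical names and values chosen for illustration. The sketch assumes files are already loaded into memory and that pattern matching has already decided whether each file was explicitly requested. It also places the `--- {filePath} ---` separator as a header before each file's content, which is one possible reading of the documented format.

```typescript
import { Buffer } from 'node:buffer';

// Hypothetical input shape: file bytes plus a flag saying whether a `paths` or
// `include` pattern named this file (or its extension) explicitly.
interface InputFile {
  path: string;
  data: Buffer;
  explicitlyRequested: boolean;
}

interface InlineData {
  path: string;
  mimeType: string;
  base64: string;
}

interface ReadResult {
  text: string;
  inlineData: InlineData[];
  skipped: string[];
}

// Assumed extension-to-MIME table; the real tool may recognize other types.
const MEDIA_TYPES: Record<string, string> = {
  '.png': 'image/png',
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.pdf': 'application/pdf',
};

// Heuristic from the docs: a null byte in the initial content suggests a binary file.
// The sample size is an assumption, not a documented value.
function looksBinary(data: Buffer, sampleSize = 1024): boolean {
  return data.subarray(0, sampleSize).includes(0);
}

// Lowercased extension of the final path segment, or '' if there is none.
function extensionOf(filePath: string): string {
  const name = filePath.slice(filePath.lastIndexOf('/') + 1);
  const dot = name.lastIndexOf('.');
  return dot === -1 ? '' : name.slice(dot).toLowerCase();
}

function readMany(files: InputFile[]): ReadResult {
  const textParts: string[] = [];
  const inlineData: InlineData[] = [];
  const skipped: string[] = [];

  for (const file of files) {
    const mimeType = MEDIA_TYPES[extensionOf(file.path)];
    if (mimeType !== undefined) {
      // Images and PDFs are returned only when explicitly requested.
      if (file.explicitlyRequested) {
        inlineData.push({ path: file.path, mimeType, base64: file.data.toString('base64') });
      } else {
        skipped.push(file.path);
      }
    } else if (looksBinary(file.data)) {
      // Other binary files are excluded from the output.
      skipped.push(file.path);
    } else {
      // Text files are decoded as UTF-8 and labeled with the separator.
      textParts.push(`--- ${file.path} ---\n${file.data.toString('utf8')}`);
    }
  }

  return { text: textParts.join('\n'), inlineData, skipped };
}
```

In this sketch, a call such as `readMany([{ path: 'README.md', data: Buffer.from('hello'), explicitlyRequested: false }])` yields the text content under a `--- README.md ---` header, while an image that was matched only by a broad glob lands in `skipped` rather than `inlineData`.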


@@ -161,7 +161,14 @@ export class ReadManyFilesTool extends BaseTool<
super(
ReadManyFilesTool.Name,
'ReadManyFiles',
`Reads content from multiple text files specified by paths or glob patterns within a configured target directory and concatenates them into a single string.
`Reads content from multiple files specified by paths or glob patterns
within a configured target directory. For text files, it concatenates their content
into a single string. It is primarily designed for text-based files. However, it can
also process image (e.g., .png, .jpg) and PDF (.pdf) files if their file names or
extensions are explicitly included in the 'paths' argument. For these explicitly
requested non-text files, their data is read and included in a format suitable for
model consumption (e.g., base64 encoded).
This tool is useful when you need to understand or analyze a collection of files, such as:
- Getting an overview of a codebase or parts of it (e.g., all TypeScript files in the 'src' directory).
- Finding where specific functionality is implemented if the user asks broad questions about code.
@@ -169,12 +176,15 @@ This tool is useful when you need to understand or analyze a collection of files
- Gathering context from multiple configuration files.
- When the user asks to "read all files in X directory" or "show me the content of all Y files".
Use this tool when the user's query implies needing the content of several files simultaneously for context, analysis, or summarization.
It uses default UTF-8 encoding and a '--- {filePath} ---' separator between file contents.
Use this tool when the user's query implies needing the content of several files
simultaneously for context, analysis, or summarization.
For text files, it uses default UTF-8 encoding and a '--- {filePath} ---' separator between file contents.
Ensure paths are relative to the target directory. Glob patterns like 'src/**/*.js' are supported.
Avoid using for single files if a more specific single-file reading tool is available, unless the user specifically requests to process a list containing just one file via this tool.
This tool should NOT be used for binary files; it attempts to skip them.
Default excludes apply to common non-text files and large dependency directories unless 'useDefaultExcludes' is false.`,
Avoid using for single files if a more specific single-file reading tool is available,
unless the user specifically requests to process a list containing just one file via this tool.
Other binary files (not explicitly requested as image/PDF) are generally skipped.
Default excludes apply to common non-text files (except for explicitly requested images/PDFs)
and large dependency directories unless 'useDefaultExcludes' is false.`,
parameterSchema,
);
this.targetDir = path.resolve(targetDir);