fix: CLAUDE.md compatibility for GEMINI.md '@' file import behavior (#2978)

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Allen Hutchison <adh@google.com>

parent ae86c7ba05
commit 9a6422f331

@@ -1,18 +1,14 @@
# Memory Import Processor

The Memory Import Processor is a feature that allows you to modularize your GEMINI.md files by importing content from other markdown files using the `@file.md` syntax.
The Memory Import Processor is a feature that allows you to modularize your GEMINI.md files by importing content from other files using the `@file.md` syntax.

## Overview

This feature enables you to break down large GEMINI.md files into smaller, more manageable components that can be reused across different contexts. The import processor supports both relative and absolute paths, with built-in safety features to prevent circular imports and ensure file access security.

## Important Limitations

**This feature only supports `.md` (markdown) files.** Attempting to import files with other extensions (like `.txt`, `.json`, etc.) will result in a warning and the import will fail.

## Syntax

Use the `@` symbol followed by the path to the markdown file you want to import:
Use the `@` symbol followed by the path to the file you want to import:

```markdown
# Main GEMINI.md file
@@ -96,24 +92,10 @@ The `validateImportPath` function ensures that imports are only allowed from spe

### Maximum Import Depth

To prevent infinite recursion, there's a configurable maximum import depth (default: 10 levels).
To prevent infinite recursion, there's a configurable maximum import depth (default: 5 levels).

## Error Handling

### Non-MD File Attempts

If you try to import a non-markdown file, you'll see a warning:

```markdown
@./instructions.txt <!-- This will show a warning and fail -->
```

Console output:

```
[WARN] [ImportProcessor] Import processor only supports .md files. Attempting to import non-md file: ./instructions.txt. This will fail.
```

### Missing Files

If a referenced file doesn't exist, the import will fail gracefully with an error comment in the output.
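As an aside on the depth limit and failure modes documented above, here is a minimal, hypothetical sketch. The `Files` map and `resolve` helper are invented for illustration only; the real processor is async and file-system based.

```typescript
// Hypothetical miniature of the import loop: a depth cap stops runaway
// recursion, and a missing file degrades to an error comment.
type Files = { [name: string]: string | undefined };

function resolve(
  files: Files,
  file: string,
  depth: number,
  maxDepth: number,
): string {
  const body = files[file];
  if (body === undefined) {
    // Missing files fail gracefully with an error comment
    return `<!-- Import failed: ${file} - not found -->`;
  }
  if (depth >= maxDepth) {
    // Depth limit reached: leave remaining @imports unresolved
    return body;
  }
  return body.replace(/@(\S+\.md)/g, (_, p: string) =>
    resolve(files, p, depth + 1, maxDepth),
  );
}
```

With a chain `a.md → b.md → c.md`, a generous depth cap resolves all the way down, while a cap of 1 leaves `b.md`'s own import untouched.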
@@ -122,6 +104,36 @@ If a referenced file doesn't exist, the import will fail gracefully with an erro

Permission issues or other file system errors are handled gracefully with appropriate error messages.

## Code Region Detection

The import processor uses the `marked` library to detect code blocks and inline code spans, ensuring that `@` imports inside these regions are properly ignored. This provides robust handling of nested code blocks and complex Markdown structures.
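The `marked`-based detector itself appears later in this diff; as a rough, simplified stand-in (fenced blocks only, no inline code spans, no `marked`), the idea of recording code regions as offset ranges can be sketched like this:

```typescript
// Simplified stand-in for code-region detection: record [start, end) offsets
// of fenced code blocks so `@` occurrences inside them can be ignored.
function fencedRegions(content: string): Array<[number, number]> {
  const regions: Array<[number, number]> = [];
  let open = -1;
  let i = 0;
  while ((i = content.indexOf('```', i)) !== -1) {
    if (open === -1) {
      open = i; // opening fence
    } else {
      regions.push([open, i + 3]); // closing fence ends the region
      open = -1;
    }
    i += 3;
  }
  return regions;
}

function isInRegion(pos: number, regions: Array<[number, number]>): boolean {
  return regions.some(([s, e]) => pos >= s && pos < e);
}
```

An `@` found inside such a region is left alone; one outside is treated as an import.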
## Import Tree Structure

The processor returns an import tree that shows the hierarchy of imported files, similar to Claude's `/memory` feature. This helps users debug problems with their GEMINI.md files by showing which files were read and their import relationships.

Example tree structure:

```
Memory Files
L project: GEMINI.md
L a.md
L b.md
L c.md
L d.md
L e.md
L f.md
L included.md
```

The tree preserves the order that files were imported and shows the complete import chain for debugging purposes.
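`MemoryFile` is the tree node type introduced by this change; a hypothetical `renderTree` helper (not part of the diff) shows how such a tree maps onto the `L`-prefixed listing above:

```typescript
interface MemoryFile {
  path: string;
  imports?: MemoryFile[]; // direct imports, in import order
}

// Renders a MemoryFile hierarchy in the indented "L"-prefixed style above.
// (Illustrative helper only; indentation depth is an assumption.)
function renderTree(node: MemoryFile, indent = ' '): string {
  const lines = [`${indent}L ${node.path}`];
  for (const child of node.imports ?? []) {
    lines.push(renderTree(child, indent + ' '));
  }
  return lines.join('\n');
}
```

Rendering a root with two imports, one of which has its own import, yields four lines in import order.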
## Comparison to Claude Code's `/memory` (`claude.md`) Approach

Claude Code's `/memory` feature (as seen in `claude.md`) produces a flat, linear document by concatenating all included files, always marking file boundaries with clear comments and path names. It does not explicitly present the import hierarchy, but the LLM receives all file contents and paths, which is sufficient for reconstructing the hierarchy if needed.

Note: The import tree is mainly for clarity during development and has limited relevance to LLM consumption.

## API Reference

### `processImports(content, basePath, debugMode?, importState?)`
@@ -135,7 +147,25 @@ Processes import statements in GEMINI.md content.
- `debugMode` (boolean, optional): Whether to enable debug logging (default: false)
- `importState` (ImportState, optional): State tracking for circular import prevention

**Returns:** Promise<string> - Processed content with imports resolved
**Returns:** Promise<ProcessImportsResult> - Object containing processed content and import tree

### `ProcessImportsResult`

```typescript
interface ProcessImportsResult {
  content: string; // The processed content with imports resolved
  importTree: MemoryFile; // Tree structure showing the import hierarchy
}
```

### `MemoryFile`

```typescript
interface MemoryFile {
  path: string; // The file path
  imports?: MemoryFile[]; // Direct imports, in the order they were imported
}
```
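A small consumer sketch of these interfaces (the `collectPaths` helper is invented for illustration): walking `importTree` yields the list of files that were read, in import order.

```typescript
interface MemoryFile {
  path: string;
  imports?: MemoryFile[];
}

// Collects every file path reachable from an import tree, depth-first and in
// import order — handy for asserting which files were read while debugging.
function collectPaths(node: MemoryFile): string[] {
  return [node.path, ...(node.imports ?? []).flatMap(collectPaths)];
}
```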
### `validateImportPath(importPath, basePath, allowedDirectories)`

@@ -149,6 +179,16 @@ Validates import paths to ensure they are safe and within allowed directories.

**Returns:** boolean - Whether the import path is valid

### `findProjectRoot(startDir)`

Finds the project root by searching for a `.git` directory upwards from the given start directory. Implemented as an **async** function using non-blocking file system APIs to avoid blocking the Node.js event loop.

**Parameters:**

- `startDir` (string): The directory to start searching from

**Returns:** Promise<string> - The project root directory (or the start directory if no `.git` is found)
## Best Practices

1. **Use descriptive file names** for imported components
@@ -161,7 +201,7 @@ Validates import paths to ensure they are safe and within allowed directories.

### Common Issues

1. **Import not working**: Check that the file exists and has a `.md` extension
1. **Import not working**: Check that the file exists and the path is correct
2. **Circular import warnings**: Review your import structure for circular references
3. **Permission errors**: Ensure the files are readable and within allowed directories
4. **Path resolution issues**: Use absolute paths if relative paths aren't resolving correctly

@@ -14,6 +14,7 @@
    "gemini": "bundle/gemini.js"
  },
  "devDependencies": {
    "@types/marked": "^5.0.2",
    "@types/micromatch": "^4.0.9",
    "@types/mime-types": "^3.0.1",
    "@types/minimatch": "^5.1.2",
@@ -2338,6 +2339,13 @@
      "dev": true,
      "license": "MIT"
    },
    "node_modules/@types/marked": {
      "version": "5.0.2",
      "resolved": "https://registry.npmjs.org/@types/marked/-/marked-5.0.2.tgz",
      "integrity": "sha512-OucS4KMHhFzhz27KxmWg7J+kIYqyqoW5kdIEI319hqARQQUTqhao3M/F+uFnDXD0Rg72iDDZxZNxq5gvctmLlg==",
      "dev": true,
      "license": "MIT"
    },
    "node_modules/@types/micromatch": {
      "version": "4.0.9",
      "resolved": "https://registry.npmjs.org/@types/micromatch/-/micromatch-4.0.9.tgz",
@@ -7687,6 +7695,18 @@
        "node": ">=10"
      }
    },
    "node_modules/marked": {
      "version": "15.0.12",
      "resolved": "https://registry.npmjs.org/marked/-/marked-15.0.12.tgz",
      "integrity": "sha512-8dD6FusOQSrpv9Z1rdNMdlSgQOIP880DHqnohobOmYLElGEqAL/JvxvuxZO16r4HtjTlfPRDC1hbvxC9dPN2nA==",
      "license": "MIT",
      "bin": {
        "marked": "bin/marked.js"
      },
      "engines": {
        "node": ">= 18"
      }
    },
    "node_modules/math-intrinsics": {
      "version": "1.1.0",
      "resolved": "https://registry.npmjs.org/math-intrinsics/-/math-intrinsics-1.1.0.tgz",
@@ -11861,6 +11881,7 @@
        "html-to-text": "^9.0.5",
        "https-proxy-agent": "^7.0.6",
        "ignore": "^7.0.0",
        "marked": "^15.0.12",
        "micromatch": "^4.0.8",
        "open": "^10.1.2",
        "shell-quote": "^1.8.3",

@@ -58,6 +58,7 @@
    "LICENSE"
  ],
  "devDependencies": {
    "@types/marked": "^5.0.2",
    "@types/micromatch": "^4.0.9",
    "@types/mime-types": "^3.0.1",
    "@types/minimatch": "^5.1.2",

@@ -494,6 +494,7 @@ describe('Hierarchical Memory Loading (config.ts) - Placeholder Suite', () => {
        '/path/to/ext3/context1.md',
        '/path/to/ext3/context2.md',
      ],
      'tree',
      {
        respectGitIgnore: false,
        respectGeminiIgnore: true,

@@ -59,6 +59,7 @@ export interface CliArgs {
  experimentalAcp: boolean | undefined;
  extensions: string[] | undefined;
  listExtensions: boolean | undefined;
  ideMode?: boolean | undefined;
  ideModeFeature: boolean | undefined;
  proxy: string | undefined;
  includeDirectories: string[] | undefined;
@@ -224,7 +225,11 @@ export async function parseArguments(): Promise<CliArgs> {
  });

  yargsInstance.wrap(yargsInstance.terminalWidth());
  return yargsInstance.argv;
  const result = yargsInstance.parseSync();

  // The import format is now only controlled by settings.memoryImportFormat
  // We no longer accept it as a CLI argument
  return result as CliArgs;
}

// This function is now a thin wrapper around the server's implementation.
@@ -236,11 +241,12 @@ export async function loadHierarchicalGeminiMemory(
  fileService: FileDiscoveryService,
  settings: Settings,
  extensionContextFilePaths: string[] = [],
  memoryImportFormat: 'flat' | 'tree' = 'tree',
  fileFilteringOptions?: FileFilteringOptions,
): Promise<{ memoryContent: string; fileCount: number }> {
  if (debugMode) {
    logger.debug(
      `CLI: Delegating hierarchical memory load to server for CWD: ${currentWorkingDirectory}`,
      `CLI: Delegating hierarchical memory load to server for CWD: ${currentWorkingDirectory} (memoryImportFormat: ${memoryImportFormat})`,
    );
  }

@@ -251,6 +257,7 @@ export async function loadHierarchicalGeminiMemory(
    debugMode,
    fileService,
    extensionContextFilePaths,
    memoryImportFormat,
    fileFilteringOptions,
    settings.memoryDiscoveryMaxDirs,
  );
@@ -266,9 +273,12 @@ export async function loadCliConfig(
    argv.debug ||
    [process.env.DEBUG, process.env.DEBUG_MODE].some(
      (v) => v === 'true' || v === '1',
    );

  const ideMode = settings.ideMode ?? false;
    ) ||
    false;
  const memoryImportFormat = settings.memoryImportFormat || 'tree';
  const ideMode =
    (argv.ideMode ?? settings.ideMode ?? false) &&
    process.env.TERM_PROGRAM === 'vscode';

  const ideModeFeature =
    (argv.ideModeFeature ?? settings.ideModeFeature ?? false) &&
@@ -314,6 +324,7 @@ export async function loadCliConfig(
    fileService,
    settings,
    extensionContextFilePaths,
    memoryImportFormat,
    fileFiltering,
  );

@@ -98,6 +98,7 @@ export interface Settings {
  summarizeToolOutput?: Record<string, SummarizeToolOutputSettings>;

  vimMode?: boolean;
  memoryImportFormat?: 'tree' | 'flat';

  // Flag to be removed post-launch.
  ideModeFeature?: boolean;

@@ -277,6 +277,7 @@ const App = ({ config, settings, startupWarnings = [], version }: AppProps) => {
        config.getFileService(),
        settings.merged,
        config.getExtensionContextFilePaths(),
        settings.merged.memoryImportFormat || 'tree', // Use setting or default to 'tree'
        config.getFileFilteringOptions(),
      );

@@ -92,6 +92,7 @@ export const memoryCommand: SlashCommand = {
        config.getDebugMode(),
        config.getFileService(),
        config.getExtensionContextFilePaths(),
        context.services.settings.merged.memoryImportFormat || 'tree', // Use setting or default to 'tree'
        config.getFileFilteringOptions(),
        context.services.settings.merged.memoryDiscoveryMaxDirs,
      );

@@ -39,6 +39,7 @@
    "html-to-text": "^9.0.5",
    "https-proxy-agent": "^7.0.6",
    "ignore": "^7.0.0",
    "marked": "^15.0.12",
    "micromatch": "^4.0.8",
    "open": "^10.1.2",
    "shell-quote": "^1.8.3",

@@ -305,10 +305,12 @@ Subdir memory
      false,
      new FileDiscoveryService(projectRoot),
      [],
      'tree',
      {
        respectGitIgnore: true,
        respectGeminiIgnore: true,
      },
      200, // maxDirs parameter
    );

    expect(result).toEqual({
@@ -334,6 +336,7 @@ My code memory
      true,
      new FileDiscoveryService(projectRoot),
      [],
      'tree', // importFormat
      {
        respectGitIgnore: true,
        respectGeminiIgnore: true,

@@ -43,7 +43,7 @@ async function findProjectRoot(startDir: string): Promise<string | null> {
  while (true) {
    const gitPath = path.join(currentDir, '.git');
    try {
      const stats = await fs.stat(gitPath);
      const stats = await fs.lstat(gitPath);
      if (stats.isDirectory()) {
        return currentDir;
      }
@@ -230,6 +230,7 @@ async function getGeminiMdFilePathsInternal(
async function readGeminiMdFiles(
  filePaths: string[],
  debugMode: boolean,
  importFormat: 'flat' | 'tree' = 'tree',
): Promise<GeminiFileContent[]> {
  const results: GeminiFileContent[] = [];
  for (const filePath of filePaths) {
@@ -237,16 +238,19 @@ async function readGeminiMdFiles(
      const content = await fs.readFile(filePath, 'utf-8');

      // Process imports in the content
      const processedContent = await processImports(
      const processedResult = await processImports(
        content,
        path.dirname(filePath),
        debugMode,
        undefined,
        undefined,
        importFormat,
      );

      results.push({ filePath, content: processedContent });
      results.push({ filePath, content: processedResult.content });
      if (debugMode)
        logger.debug(
          `Successfully read and processed imports: ${filePath} (Length: ${processedContent.length})`,
          `Successfully read and processed imports: ${filePath} (Length: ${processedResult.content.length})`,
        );
    } catch (error: unknown) {
      const isTestEnv = process.env.NODE_ENV === 'test' || process.env.VITEST;
@@ -293,12 +297,13 @@ export async function loadServerHierarchicalMemory(
  debugMode: boolean,
  fileService: FileDiscoveryService,
  extensionContextFilePaths: string[] = [],
  importFormat: 'flat' | 'tree' = 'tree',
  fileFilteringOptions?: FileFilteringOptions,
  maxDirs: number = 200,
): Promise<{ memoryContent: string; fileCount: number }> {
  if (debugMode)
    logger.debug(
      `Loading server hierarchical memory for CWD: ${currentWorkingDirectory}`,
      `Loading server hierarchical memory for CWD: ${currentWorkingDirectory} (importFormat: ${importFormat})`,
    );

  // For the server, homedir() refers to the server process's home.
@@ -317,7 +322,11 @@ export async function loadServerHierarchicalMemory(
    if (debugMode) logger.debug('No GEMINI.md files found in hierarchy.');
    return { memoryContent: '', fileCount: 0 };
  }
  const contentsWithPaths = await readGeminiMdFiles(filePaths, debugMode);
  const contentsWithPaths = await readGeminiMdFiles(
    filePaths,
    debugMode,
    importFormat,
  );
  // Pass CWD for relative path display in concatenated content
  const combinedInstructions = concatenateInstructions(
    contentsWithPaths,

File diff suppressed because it is too large
@@ -6,6 +6,7 @@

import * as fs from 'fs/promises';
import * as path from 'path';
import { marked } from 'marked';

// Simple console logger for import processing
const logger = {
@@ -29,15 +30,176 @@ interface ImportState {
  currentFile?: string; // Track the current file being processed
}

/**
 * Interface representing a file in the import tree
 */
export interface MemoryFile {
  path: string;
  imports?: MemoryFile[]; // Direct imports, in the order they were imported
}

/**
 * Result of processing imports
 */
export interface ProcessImportsResult {
  content: string;
  importTree: MemoryFile;
}

// Helper to find the project root (looks for .git directory)
async function findProjectRoot(startDir: string): Promise<string> {
  let currentDir = path.resolve(startDir);
  while (true) {
    const gitPath = path.join(currentDir, '.git');
    try {
      const stats = await fs.lstat(gitPath);
      if (stats.isDirectory()) {
        return currentDir;
      }
    } catch {
      // .git not found, continue to parent
    }
    const parentDir = path.dirname(currentDir);
    if (parentDir === currentDir) {
      // Reached filesystem root
      break;
    }
    currentDir = parentDir;
  }
  // Fallback to startDir if .git not found
  return path.resolve(startDir);
}

// Add a type guard for error objects
function hasMessage(err: unknown): err is { message: string } {
  return (
    typeof err === 'object' &&
    err !== null &&
    'message' in err &&
    typeof (err as { message: unknown }).message === 'string'
  );
}

// Helper to find all code block and inline code regions using marked
/**
 * Finds all import statements in content without using regex
 * @returns Array of {start, _end, path} objects for each import found
 */
function findImports(
  content: string,
): Array<{ start: number; _end: number; path: string }> {
  const imports: Array<{ start: number; _end: number; path: string }> = [];
  let i = 0;
  const len = content.length;

  while (i < len) {
    // Find next @ symbol
    i = content.indexOf('@', i);
    if (i === -1) break;

    // Check if it's a word boundary (not part of another word)
    if (i > 0 && !isWhitespace(content[i - 1])) {
      i++;
      continue;
    }

    // Find the end of the import path (whitespace or newline)
    let j = i + 1;
    while (
      j < len &&
      !isWhitespace(content[j]) &&
      content[j] !== '\n' &&
      content[j] !== '\r'
    ) {
      j++;
    }

    // Extract the path (everything after @)
    const importPath = content.slice(i + 1, j);

    // Basic validation (starts with ./ or / or letter)
    if (
      importPath.length > 0 &&
      (importPath[0] === '.' ||
        importPath[0] === '/' ||
        isLetter(importPath[0]))
    ) {
      imports.push({
        start: i,
        _end: j,
        path: importPath,
      });
    }

    i = j + 1;
  }

  return imports;
}

function isWhitespace(char: string): boolean {
  return char === ' ' || char === '\t' || char === '\n' || char === '\r';
}

function isLetter(char: string): boolean {
  const code = char.charCodeAt(0);
  return (
    (code >= 65 && code <= 90) || // A-Z
    (code >= 97 && code <= 122)
  ); // a-z
}

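To show what the scanner above extracts, here is a condensed restatement. It deliberately uses a regex for brevity, whereas `findImports` above avoids regex; the output shape (a `start` offset plus the path after `@`) is the same idea, and the `scanImports` name is invented for this sketch.

```typescript
// Condensed restatement of the import scanner: an `@` at a word boundary,
// followed by a path starting with '.', '/', or a letter, up to whitespace.
function scanImports(content: string): Array<{ start: number; path: string }> {
  const out: Array<{ start: number; path: string }> = [];
  const re = /(^|[\s])@([A-Za-z./][^\s]*)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(content)) !== null) {
    out.push({ start: m.index + m[1].length, path: m[2] });
  }
  return out;
}
```

Note how `user@host.md` is skipped because the `@` is not at a word boundary.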
function findCodeRegions(content: string): Array<[number, number]> {
  const regions: Array<[number, number]> = [];
  const tokens = marked.lexer(content);

  // Map from raw content to a queue of its start indices in the original content.
  const rawContentIndices = new Map<string, number[]>();

  function walk(token: { type: string; raw: string; tokens?: unknown[] }) {
    if (token.type === 'code' || token.type === 'codespan') {
      if (!rawContentIndices.has(token.raw)) {
        const indices: number[] = [];
        let lastIndex = -1;
        while ((lastIndex = content.indexOf(token.raw, lastIndex + 1)) !== -1) {
          indices.push(lastIndex);
        }
        rawContentIndices.set(token.raw, indices);
      }

      const indices = rawContentIndices.get(token.raw);
      if (indices && indices.length > 0) {
        // Assume tokens are processed in order of appearance.
        // Dequeue the next available index for this raw content.
        const idx = indices.shift()!;
        regions.push([idx, idx + token.raw.length]);
      }
    }

    if ('tokens' in token && token.tokens) {
      for (const child of token.tokens) {
        walk(child as { type: string; raw: string; tokens?: unknown[] });
      }
    }
  }

  for (const token of tokens) {
    walk(token);
  }

  return regions;
}

/**
 * Processes import statements in GEMINI.md content
 * Supports @path/to/file.md syntax for importing content from other files
 *
 * Supports @path/to/file syntax for importing content from other files
 * @param content - The content to process for imports
 * @param basePath - The directory path where the current file is located
 * @param debugMode - Whether to enable debug logging
 * @param importState - State tracking for circular import prevention
 * @returns Processed content with imports resolved
 * @param projectRoot - The project root directory for allowed directories
 * @param importFormat - The format of the import tree
 * @returns Processed content with imports resolved and import tree
 */
export async function processImports(
  content: string,
@@ -45,156 +207,198 @@ export async function processImports(
  debugMode: boolean = false,
  importState: ImportState = {
    processedFiles: new Set(),
    maxDepth: 10,
    maxDepth: 5,
    currentDepth: 0,
  },
): Promise<string> {
  projectRoot?: string,
  importFormat: 'flat' | 'tree' = 'tree',
): Promise<ProcessImportsResult> {
  if (!projectRoot) {
    projectRoot = await findProjectRoot(basePath);
  }

  if (importState.currentDepth >= importState.maxDepth) {
    if (debugMode) {
      logger.warn(
        `Maximum import depth (${importState.maxDepth}) reached. Stopping import processing.`,
      );
    }
    return content;
    return {
      content,
      importTree: { path: importState.currentFile || 'unknown' },
    };
  }

  // Regex to match @path/to/file imports (supports any file extension)
  // Supports both @path/to/file.md and @./path/to/file.md syntax
  const importRegex = /@([./]?[^\s\n]+\.[^\s\n]+)/g;
  // --- FLAT FORMAT LOGIC ---
  if (importFormat === 'flat') {
    // Use a queue to process files in order of first encounter, and a set to avoid duplicates
    const flatFiles: Array<{ path: string; content: string }> = [];
    // Track processed files across the entire operation
    const processedFiles = new Set<string>();

  let processedContent = content;
  let match: RegExpExecArray | null;
    // Helper to recursively process imports
    async function processFlat(
      fileContent: string,
      fileBasePath: string,
      filePath: string,
      depth: number,
    ) {
      // Normalize the file path to ensure consistent comparison
      const normalizedPath = path.normalize(filePath);

  // Process all imports in the content
  while ((match = importRegex.exec(content)) !== null) {
    const importPath = match[1];
      // Skip if already processed
      if (processedFiles.has(normalizedPath)) return;

    // Validate import path to prevent path traversal attacks
    if (!validateImportPath(importPath, basePath, [basePath])) {
      processedContent = processedContent.replace(
        match[0],
        `<!-- Import failed: ${importPath} - Path traversal attempt -->`,
      );
      continue;
    }
      // Mark as processed before processing to prevent infinite recursion
      processedFiles.add(normalizedPath);

    // Check if the import is for a non-md file and warn
    if (!importPath.endsWith('.md')) {
      logger.warn(
        `Import processor only supports .md files. Attempting to import non-md file: ${importPath}. This will fail.`,
      );
      // Replace the import with a warning comment
      processedContent = processedContent.replace(
        match[0],
        `<!-- Import failed: ${importPath} - Only .md files are supported -->`,
      );
      continue;
    }
      // Add this file to the flat list
      flatFiles.push({ path: normalizedPath, content: fileContent });

    const fullPath = path.resolve(basePath, importPath);
      // Find imports in this file
      const codeRegions = findCodeRegions(fileContent);
      const imports = findImports(fileContent);

    if (debugMode) {
      logger.debug(`Processing import: ${importPath} -> ${fullPath}`);
    }
      // Process imports in reverse order to handle indices correctly
      for (let i = imports.length - 1; i >= 0; i--) {
        const { start, _end, path: importPath } = imports[i];

    // Check for circular imports - if we're already processing this file
    if (importState.currentFile === fullPath) {
      if (debugMode) {
        logger.warn(`Circular import detected: ${importPath}`);
      }
      // Replace the import with a warning comment
      processedContent = processedContent.replace(
        match[0],
        `<!-- Circular import detected: ${importPath} -->`,
      );
      continue;
    }

    // Check if we've already processed this file in this import chain
    if (importState.processedFiles.has(fullPath)) {
      if (debugMode) {
        logger.warn(`File already processed in this chain: ${importPath}`);
      }
      // Replace the import with a warning comment
      processedContent = processedContent.replace(
        match[0],
        `<!-- File already processed: ${importPath} -->`,
      );
      continue;
    }

    // Check for potential circular imports by looking at the import chain
    if (importState.currentFile) {
      const currentFileDir = path.dirname(importState.currentFile);
      const potentialCircularPath = path.resolve(currentFileDir, importPath);
      if (potentialCircularPath === importState.currentFile) {
        if (debugMode) {
          logger.warn(`Circular import detected: ${importPath}`);
        // Skip if inside a code region
        if (
          codeRegions.some(
            ([regionStart, regionEnd]) =>
              start >= regionStart && start < regionEnd,
          )
        ) {
          continue;
        }

        // Validate import path
        if (
          !validateImportPath(importPath, fileBasePath, [projectRoot || ''])
        ) {
          continue;
        }

        const fullPath = path.resolve(fileBasePath, importPath);
        const normalizedFullPath = path.normalize(fullPath);

        // Skip if already processed
        if (processedFiles.has(normalizedFullPath)) continue;

        try {
          await fs.access(fullPath);
          const importedContent = await fs.readFile(fullPath, 'utf-8');

          // Process the imported file
          await processFlat(
            importedContent,
            path.dirname(fullPath),
            normalizedFullPath,
            depth + 1,
          );
        } catch (error) {
          if (debugMode) {
            logger.warn(
              `Failed to import ${fullPath}: ${hasMessage(error) ? error.message : 'Unknown error'}`,
            );
          }
          // Continue with other imports even if one fails
        }
        // Replace the import with a warning comment
        processedContent = processedContent.replace(
          match[0],
          `<!-- Circular import detected: ${importPath} -->`,
        );
        continue;
      }
    }

    // Start with the root file (current file)
    const rootPath = path.normalize(
      importState.currentFile || path.resolve(basePath),
    );
    await processFlat(content, basePath, rootPath, 0);

    // Concatenate all unique files in order, Claude-style
    const flatContent = flatFiles
      .map(
        (f) =>
          `--- File: ${f.path} ---\n${f.content.trim()}\n--- End of File: ${f.path} ---`,
      )
      .join('\n\n');

    return {
      content: flatContent,
      importTree: { path: rootPath }, // Tree not meaningful in flat mode
    };
  }

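The flat branch above ends by concatenating the collected files Claude-style. Isolated as a pure helper (extracted here for illustration; it is an inline expression, not a separate function, in the diff), the formatting step is:

```typescript
// Mirrors the flat-mode output format above: each file wrapped in
// "--- File: ... ---" markers, sections joined by blank lines.
function flatConcat(files: Array<{ path: string; content: string }>): string {
  return files
    .map(
      (f) =>
        `--- File: ${f.path} ---\n${f.content.trim()}\n--- End of File: ${f.path} ---`,
    )
    .join('\n\n');
}
```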
// --- TREE FORMAT LOGIC (existing) ---
|
||||
const codeRegions = findCodeRegions(content);
|
||||
let result = '';
|
||||
let lastIndex = 0;
|
||||
const imports: MemoryFile[] = [];
|
||||
const importsList = findImports(content);
|
||||
|
||||
for (const { start, _end, path: importPath } of importsList) {
|
||||
// Add content before this import
|
||||
result += content.substring(lastIndex, start);
|
||||
lastIndex = _end;
|
||||
|
||||
// Skip if inside a code region
|
||||
if (codeRegions.some(([s, e]) => start >= s && start < e)) {
|
||||
result += `@${importPath}`;
|
||||
continue;
|
||||
}
|
||||
// Validate import path to prevent path traversal attacks
|
||||
if (!validateImportPath(importPath, basePath, [projectRoot || ''])) {
|
||||
result += `<!-- Import failed: ${importPath} - Path traversal attempt -->`;
|
||||
continue;
|
||||
}
|
||||
const fullPath = path.resolve(basePath, importPath);
|
||||
if (importState.processedFiles.has(fullPath)) {
|
||||
result += `<!-- File already processed: ${importPath} -->`;
|
||||
continue;
|
||||
}
|
||||
    try {
      // Check if the file exists
      await fs.access(fullPath);

      // Read the imported file content
      const fileContent = await fs.readFile(fullPath, 'utf-8');

      if (debugMode) {
        logger.debug(`Successfully read imported file: ${fullPath}`);
      }

      // Mark this file as processed for this import chain
      const newImportState: ImportState = {
        ...importState,
        processedFiles: new Set(importState.processedFiles),
        currentDepth: importState.currentDepth + 1,
        currentFile: fullPath, // Set the current file being processed
      };
      newImportState.processedFiles.add(fullPath);

      // Recursively process imports in the imported content
      const imported = await processImports(
        fileContent,
        path.dirname(fullPath),
        debugMode,
        newImportState,
        projectRoot,
        importFormat,
      );

      // Replace the import statement with the processed content
      result += `<!-- Imported from: ${importPath} -->\n${imported.content}\n<!-- End of import from: ${importPath} -->`;
      imports.push(imported.importTree);
    } catch (err: unknown) {
      let message = 'Unknown error';
      if (hasMessage(err)) {
        message = err.message;
      } else if (typeof err === 'string') {
        message = err;
      }

      // Replace the import with an error comment
      logger.error(`Failed to import ${importPath}: ${message}`);
      result += `<!-- Import failed: ${importPath} - ${message} -->`;
    }
  }
  // Add any remaining content after the last match
  result += content.substring(lastIndex);

  return {
    content: result,
    importTree: {
      path: importState.currentFile || 'unknown',
      imports: imports.length > 0 ? imports : undefined,
    },
  };
}

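The function above returns both the expanded content and an `importTree` describing which files were pulled in and how they nest. As a hedged sketch of how that return shape can be consumed (the `ImportTree` interface mirrors the structure built above, but the `flatten` helper and the sample file names are purely illustrative, not part of the real module):

```typescript
// Illustrative sketch: walk an importTree like the one processImports returns
// and render it as an indented list for debugging. `flatten` is hypothetical.
interface ImportTree {
  path: string;
  imports?: ImportTree[];
}

function flatten(tree: ImportTree, depth = 0, out: string[] = []): string[] {
  // Two spaces of indentation per nesting level.
  out.push(`${'  '.repeat(depth)}${tree.path}`);
  for (const child of tree.imports ?? []) {
    flatten(child, depth + 1, out);
  }
  return out;
}

const tree: ImportTree = {
  path: 'GEMINI.md',
  imports: [
    { path: 'components/instructions.md' },
    { path: 'components/git-flow.md', imports: [{ path: 'shared/style.md' }] },
  ],
};
console.log(flatten(tree).join('\n'));
```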
/**
 * Validates import paths to ensure they are safe and within allowed directories
 *
 * @param importPath - The import path to validate
 * @param basePath - The base directory for resolving relative paths
 * @param allowedDirectories - Array of allowed directory paths
 * @returns Whether the import path is valid
 */
export function validateImportPath(
  importPath: string,
  basePath: string,
@@ -209,6 +413,8 @@ export function validateImportPath(
  return allowedDirectories.some((allowedDir) => {
    const normalizedAllowedDir = path.resolve(allowedDir);
    const isSamePath = resolvedPath === normalizedAllowedDir;
    const isSubPath = resolvedPath.startsWith(normalizedAllowedDir + path.sep);
    return isSamePath || isSubPath;
  });
}
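The boundary-aware check above matters because a plain `startsWith(normalizedAllowedDir)` would also accept sibling directories that merely share a name prefix with an allowed directory. A minimal standalone sketch of the same comparison (the `isWithinAllowed` name and the `/tmp/project` paths are illustrative assumptions, not from the real module):

```typescript
import path from 'path';

// Sketch of the boundary-aware prefix check: a path is "within" an allowed
// directory only if it is the directory itself or sits below a path-separator
// boundary, so '/tmp/project-evil' cannot pass as inside '/tmp/project'.
function isWithinAllowed(resolvedPath: string, allowedDir: string): boolean {
  const normalizedAllowedDir = path.resolve(allowedDir);
  const isSamePath = resolvedPath === normalizedAllowedDir;
  const isSubPath = resolvedPath.startsWith(normalizedAllowedDir + path.sep);
  return isSamePath || isSubPath;
}

console.log(isWithinAllowed(path.resolve('/tmp/project/docs'), '/tmp/project'));
console.log(isWithinAllowed(path.resolve('/tmp/project-evil'), '/tmp/project'));
```

Using `path.sep` keeps the boundary check correct on both POSIX (`/`) and Windows (`\`) without hardcoding a separator.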

@@ -0,0 +1,51 @@
/**
 * @license
 * Copyright 2025 Google LLC
 * SPDX-License-Identifier: Apache-2.0
 */

import path from 'path';
import { fileURLToPath } from 'url';

// Test how paths are normalized
function testPathNormalization() {
  // Use platform-agnostic path construction instead of hardcoded paths
  const testPath = path.join('test', 'project', 'src', 'file.md');
  const absoluteTestPath = path.resolve('test', 'project', 'src', 'file.md');

  console.log('Testing path normalization:');
  console.log('Relative path:', testPath);
  console.log('Absolute path:', absoluteTestPath);

  // Test path.join with different segments
  const joinedPath = path.join('test', 'project', 'src', 'file.md');
  console.log('Joined path:', joinedPath);

  // Test path.normalize
  console.log('Normalized relative path:', path.normalize(testPath));
  console.log('Normalized absolute path:', path.normalize(absoluteTestPath));

  // Test how the test would see these paths
  const testContent = `--- File: ${absoluteTestPath} ---\nContent\n--- End of File: ${absoluteTestPath} ---`;
  console.log('\nTest content with platform-agnostic paths:');
  console.log(testContent);

  // Try to match with different patterns
  const marker = `--- File: ${absoluteTestPath} ---`;
  console.log('\nTrying to match:', marker);
  console.log('Direct match:', testContent.includes(marker));

  // Test with normalized path in marker
  const normalizedMarker = `--- File: ${path.normalize(absoluteTestPath)} ---`;
  console.log(
    'Normalized marker match:',
    testContent.includes(normalizedMarker),
  );

  // Test path resolution
  const __filename = fileURLToPath(import.meta.url);
  console.log('\nCurrent file path:', __filename);
  console.log('Directory name:', path.dirname(__filename));
}

testPathNormalization();
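Beyond path handling, the other safety property in this change is the per-chain `processedFiles` set that stops circular imports. A hedged, heavily simplified sketch of that guard (the `expand` helper, the in-memory `files` map, and the `a.md`/`b.md` names are all illustrative; the real processor reads from disk and tracks depth as well):

```typescript
// Simplified sketch of the circular-import guard: each import chain carries
// its own Set of already-processed paths, so a file that imports itself,
// directly or indirectly, is replaced with a comment instead of recursing.
type Files = Record<string, string>;

function expand(
  file: string,
  files: Files,
  processed: Set<string> = new Set(),
): string {
  if (processed.has(file)) {
    return `<!-- File already processed: ${file} -->`;
  }
  // Copy the set so sibling imports are not affected by this branch.
  const next = new Set(processed);
  next.add(file);
  const src = files[file] ?? '';
  // Replace each @path token with the expanded content of that file.
  return src.replace(/@(\S+)/g, (_m, p: string) =>
    p in files ? expand(p, files, next) : `<!-- Import failed: ${p} -->`,
  );
}

const files: Files = {
  'a.md': 'A imports @b.md',
  'b.md': 'B imports @a.md', // circular back-reference
};
console.log(expand('a.md', files));
```

Copying the set per branch (rather than mutating one shared set) is what lets the same file be legitimately imported from two unrelated places while still breaking genuine cycles.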