feat: modular GEMINI.md imports with @file.md syntax (#1585) (#2230)

This commit is contained in:
Niladri Das 2025-06-30 04:21:47 +05:30 committed by GitHub
parent ada4061a45
commit f848d35758
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
5 changed files with 658 additions and 2 deletions

View File

@ -5,6 +5,7 @@ Gemini CLI's core package (`packages/core`) is the backend portion of Gemini CLI
## Navigating this section
- **[Core tools API](./tools-api.md):** Information on how tools are defined, registered, and used by the core.
- **[Memory Import Processor](./memport.md):** Documentation for the modular GEMINI.md import feature using @file.md syntax.
## Role of the core

175
docs/core/memport.md Normal file
View File

@ -0,0 +1,175 @@
# Memory Import Processor
The Memory Import Processor is a feature that allows you to modularize your GEMINI.md files by importing content from other markdown files using the `@file.md` syntax.
## Overview
This feature enables you to break down large GEMINI.md files into smaller, more manageable components that can be reused across different contexts. The import processor supports both relative and absolute paths, with built-in safety features to prevent circular imports and ensure file access security.
## Important Limitations
**This feature only supports `.md` (markdown) files.** Attempting to import files with other extensions (like `.txt`, `.json`, etc.) will result in a warning and the import will fail.
## Syntax
Use the `@` symbol followed by the path to the markdown file you want to import:
```markdown
# Main GEMINI.md file
This is the main content.
@./components/instructions.md
More content here.
@./shared/configuration.md
```
## Supported Path Formats
### Relative Paths
- `@./file.md` - Import from the same directory
- `@../file.md` - Import from parent directory
- `@./components/file.md` - Import from subdirectory
### Absolute Paths
- `@/absolute/path/to/file.md` - Import using absolute path
## Examples
### Basic Import
```markdown
# My GEMINI.md
Welcome to my project!
@./getting-started.md
## Features
@./features/overview.md
```
### Nested Imports
The imported files can themselves contain imports, creating a nested structure:
```markdown
# main.md
@./header.md
@./content.md
@./footer.md
```
```markdown
# header.md
# Project Header
@./shared/title.md
```
## Safety Features
### Circular Import Detection
The processor automatically detects and prevents circular imports:
```markdown
# file-a.md
@./file-b.md
# file-b.md
@./file-a.md <!-- This will be detected and prevented -->
```
### File Access Security
The `validateImportPath` function ensures that imports are only allowed from specified directories, preventing access to sensitive files outside the allowed scope.
### Maximum Import Depth
To prevent infinite recursion, there's a configurable maximum import depth (default: 10 levels).
## Error Handling
### Non-MD File Attempts
If you try to import a non-markdown file, you'll see a warning:
```markdown
@./instructions.txt <!-- This will show a warning and fail -->
```
Console output:
```
[WARN] [ImportProcessor] Import processor only supports .md files. Attempting to import non-md file: ./instructions.txt. This will fail.
```
### Missing Files
If a referenced file doesn't exist, the import will fail gracefully with an error comment in the output.
### File Access Errors
Permission issues or other file system errors are handled gracefully with appropriate error messages.
## API Reference
### `processImports(content, basePath, debugMode?, importState?)`
Processes import statements in GEMINI.md content.
**Parameters:**
- `content` (string): The content to process for imports
- `basePath` (string): The directory path where the current file is located
- `debugMode` (boolean, optional): Whether to enable debug logging (default: false)
- `importState` (ImportState, optional): State tracking for circular import prevention
**Returns:** Promise<string> - Processed content with imports resolved
### `validateImportPath(importPath, basePath, allowedDirectories)`
Validates import paths to ensure they are safe and within allowed directories.
**Parameters:**
- `importPath` (string): The import path to validate
- `basePath` (string): The base directory for resolving relative paths
- `allowedDirectories` (string[]): Array of allowed directory paths
**Returns:** boolean - Whether the import path is valid
## Best Practices
1. **Use descriptive file names** for imported components
2. **Keep imports shallow** - avoid deeply nested import chains
3. **Document your structure** - maintain a clear hierarchy of imported files
4. **Test your imports** - ensure all referenced files exist and are accessible
5. **Use relative paths** when possible for better portability
## Troubleshooting
### Common Issues
1. **Import not working**: Check that the file exists and has a `.md` extension
2. **Circular import warnings**: Review your import structure for circular references
3. **Permission errors**: Ensure the files are readable and within allowed directories
4. **Path resolution issues**: Use absolute paths if relative paths aren't resolving correctly
### Debug Mode
Enable debug mode to see detailed logging of the import process:
```typescript
const result = await processImports(content, basePath, true);
```

View File

@ -14,6 +14,7 @@ import {
getAllGeminiMdFilenames,
} from '../tools/memoryTool.js';
import { FileDiscoveryService } from '../services/fileDiscoveryService.js';
import { processImports } from './memoryImportProcessor.js';
// Simple console logger, similar to the one previously in CLI's config.ts
// TODO: Integrate with a more robust server-side logger if available/appropriate.
@ -223,10 +224,18 @@ async function readGeminiMdFiles(
for (const filePath of filePaths) {
try {
const content = await fs.readFile(filePath, 'utf-8');
results.push({ filePath, content });
// Process imports in the content
const processedContent = await processImports(
content,
path.dirname(filePath),
debugMode,
);
results.push({ filePath, content: processedContent });
if (debugMode)
logger.debug(
`Successfully read: ${filePath} (Length: ${content.length})`,
`Successfully read and processed imports: ${filePath} (Length: ${processedContent.length})`,
);
} catch (error: unknown) {
const isTestEnv = process.env.NODE_ENV === 'test' || process.env.VITEST;

View File

@ -0,0 +1,257 @@
/**
* @license
* Copyright 2025 Google LLC
* SPDX-License-Identifier: Apache-2.0
*/
import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
import * as fs from 'fs/promises';
import * as path from 'path';
import { processImports, validateImportPath } from './memoryImportProcessor.js';
// Mock fs/promises
vi.mock('fs/promises');
const mockedFs = vi.mocked(fs);
// Mock console methods to capture warnings
const originalConsoleWarn = console.warn;
const originalConsoleError = console.error;
const originalConsoleDebug = console.debug;
describe('memoryImportProcessor', () => {
beforeEach(() => {
vi.clearAllMocks();
// Mock console methods
console.warn = vi.fn();
console.error = vi.fn();
console.debug = vi.fn();
});
afterEach(() => {
// Restore console methods
console.warn = originalConsoleWarn;
console.error = originalConsoleError;
console.debug = originalConsoleDebug;
});
describe('processImports', () => {
it('should process basic md file imports', async () => {
const content = 'Some content @./test.md more content';
const basePath = '/test/path';
const importedContent = '# Imported Content\nThis is imported.';
mockedFs.access.mockResolvedValue(undefined);
mockedFs.readFile.mockResolvedValue(importedContent);
const result = await processImports(content, basePath, true);
expect(result).toContain('<!-- Imported from: ./test.md -->');
expect(result).toContain(importedContent);
expect(result).toContain('<!-- End of import from: ./test.md -->');
expect(mockedFs.readFile).toHaveBeenCalledWith(
path.resolve(basePath, './test.md'),
'utf-8',
);
});
it('should warn and fail for non-md file imports', async () => {
const content = 'Some content @./instructions.txt more content';
const basePath = '/test/path';
const result = await processImports(content, basePath, true);
expect(console.warn).toHaveBeenCalledWith(
'[WARN] [ImportProcessor]',
'Import processor only supports .md files. Attempting to import non-md file: ./instructions.txt. This will fail.',
);
expect(result).toContain(
'<!-- Import failed: ./instructions.txt - Only .md files are supported -->',
);
expect(mockedFs.readFile).not.toHaveBeenCalled();
});
it('should handle circular imports', async () => {
const content = 'Content @./circular.md more content';
const basePath = '/test/path';
const circularContent = 'Circular @./main.md content';
mockedFs.access.mockResolvedValue(undefined);
mockedFs.readFile.mockResolvedValue(circularContent);
// Set up the import state to simulate we're already processing main.md
const importState = {
processedFiles: new Set<string>(),
maxDepth: 10,
currentDepth: 0,
currentFile: '/test/path/main.md', // Simulate we're processing main.md
};
const result = await processImports(content, basePath, true, importState);
// The circular import should be detected when processing the nested import
expect(result).toContain('<!-- Circular import detected: ./main.md -->');
});
it('should handle file not found errors', async () => {
const content = 'Content @./nonexistent.md more content';
const basePath = '/test/path';
mockedFs.access.mockRejectedValue(new Error('File not found'));
const result = await processImports(content, basePath, true);
expect(result).toContain(
'<!-- Import failed: ./nonexistent.md - File not found -->',
);
expect(console.error).toHaveBeenCalledWith(
'[ERROR] [ImportProcessor]',
'Failed to import ./nonexistent.md: File not found',
);
});
it('should respect max depth limit', async () => {
const content = 'Content @./deep.md more content';
const basePath = '/test/path';
const deepContent = 'Deep @./deeper.md content';
mockedFs.access.mockResolvedValue(undefined);
mockedFs.readFile.mockResolvedValue(deepContent);
const importState = {
processedFiles: new Set<string>(),
maxDepth: 1,
currentDepth: 1,
};
const result = await processImports(content, basePath, true, importState);
expect(console.warn).toHaveBeenCalledWith(
'[WARN] [ImportProcessor]',
'Maximum import depth (1) reached. Stopping import processing.',
);
expect(result).toBe(content);
});
it('should handle nested imports recursively', async () => {
const content = 'Main @./nested.md content';
const basePath = '/test/path';
const nestedContent = 'Nested @./inner.md content';
const innerContent = 'Inner content';
mockedFs.access.mockResolvedValue(undefined);
mockedFs.readFile
.mockResolvedValueOnce(nestedContent)
.mockResolvedValueOnce(innerContent);
const result = await processImports(content, basePath, true);
expect(result).toContain('<!-- Imported from: ./nested.md -->');
expect(result).toContain('<!-- Imported from: ./inner.md -->');
expect(result).toContain(innerContent);
});
it('should handle absolute paths in imports', async () => {
const content = 'Content @/absolute/path/file.md more content';
const basePath = '/test/path';
const importedContent = 'Absolute path content';
mockedFs.access.mockResolvedValue(undefined);
mockedFs.readFile.mockResolvedValue(importedContent);
const result = await processImports(content, basePath, true);
expect(result).toContain(
'<!-- Import failed: /absolute/path/file.md - Path traversal attempt -->',
);
});
it('should handle multiple imports in same content', async () => {
const content = 'Start @./first.md middle @./second.md end';
const basePath = '/test/path';
const firstContent = 'First content';
const secondContent = 'Second content';
mockedFs.access.mockResolvedValue(undefined);
mockedFs.readFile
.mockResolvedValueOnce(firstContent)
.mockResolvedValueOnce(secondContent);
const result = await processImports(content, basePath, true);
expect(result).toContain('<!-- Imported from: ./first.md -->');
expect(result).toContain('<!-- Imported from: ./second.md -->');
expect(result).toContain(firstContent);
expect(result).toContain(secondContent);
});
});
describe('validateImportPath', () => {
it('should reject URLs', () => {
expect(
validateImportPath('https://example.com/file.md', '/base', [
'/allowed',
]),
).toBe(false);
expect(
validateImportPath('http://example.com/file.md', '/base', ['/allowed']),
).toBe(false);
expect(
validateImportPath('file:///path/to/file.md', '/base', ['/allowed']),
).toBe(false);
});
it('should allow paths within allowed directories', () => {
expect(validateImportPath('./file.md', '/base', ['/base'])).toBe(true);
expect(validateImportPath('../file.md', '/base', ['/allowed'])).toBe(
false,
);
expect(
validateImportPath('/allowed/sub/file.md', '/base', ['/allowed']),
).toBe(true);
});
it('should reject paths outside allowed directories', () => {
expect(
validateImportPath('/forbidden/file.md', '/base', ['/allowed']),
).toBe(false);
expect(validateImportPath('../../../file.md', '/base', ['/base'])).toBe(
false,
);
});
it('should handle multiple allowed directories', () => {
expect(
validateImportPath('./file.md', '/base', ['/allowed1', '/allowed2']),
).toBe(false);
expect(
validateImportPath('/allowed1/file.md', '/base', [
'/allowed1',
'/allowed2',
]),
).toBe(true);
expect(
validateImportPath('/allowed2/file.md', '/base', [
'/allowed1',
'/allowed2',
]),
).toBe(true);
});
it('should handle relative paths correctly', () => {
expect(validateImportPath('file.md', '/base', ['/base'])).toBe(true);
expect(validateImportPath('./file.md', '/base', ['/base'])).toBe(true);
expect(validateImportPath('../file.md', '/base', ['/parent'])).toBe(
false,
);
});
it('should handle absolute paths correctly', () => {
expect(
validateImportPath('/allowed/file.md', '/base', ['/allowed']),
).toBe(true);
expect(
validateImportPath('/forbidden/file.md', '/base', ['/allowed']),
).toBe(false);
});
});
});

View File

@ -0,0 +1,214 @@
/**
* @license
* Copyright 2025 Google LLC
* SPDX-License-Identifier: Apache-2.0
*/
import * as fs from 'fs/promises';
import * as path from 'path';
// Simple console logger for import processing
const logger = {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
debug: (...args: any[]) =>
console.debug('[DEBUG] [ImportProcessor]', ...args),
// eslint-disable-next-line @typescript-eslint/no-explicit-any
warn: (...args: any[]) => console.warn('[WARN] [ImportProcessor]', ...args),
// eslint-disable-next-line @typescript-eslint/no-explicit-any
error: (...args: any[]) =>
console.error('[ERROR] [ImportProcessor]', ...args),
};
/**
* Interface for tracking import processing state to prevent circular imports
*/
interface ImportState {
processedFiles: Set<string>;
maxDepth: number;
currentDepth: number;
currentFile?: string; // Track the current file being processed
}
/**
* Processes import statements in GEMINI.md content
* Supports @path/to/file.md syntax for importing content from other files
*
* @param content - The content to process for imports
* @param basePath - The directory path where the current file is located
* @param debugMode - Whether to enable debug logging
* @param importState - State tracking for circular import prevention
* @returns Processed content with imports resolved
*/
export async function processImports(
content: string,
basePath: string,
debugMode: boolean = false,
importState: ImportState = {
processedFiles: new Set(),
maxDepth: 10,
currentDepth: 0,
},
): Promise<string> {
if (importState.currentDepth >= importState.maxDepth) {
if (debugMode) {
logger.warn(
`Maximum import depth (${importState.maxDepth}) reached. Stopping import processing.`,
);
}
return content;
}
// Regex to match @path/to/file imports (supports any file extension)
// Supports both @path/to/file.md and @./path/to/file.md syntax
const importRegex = /@([./]?[^\s\n]+\.[^\s\n]+)/g;
let processedContent = content;
let match: RegExpExecArray | null;
// Process all imports in the content
while ((match = importRegex.exec(content)) !== null) {
const importPath = match[1];
// Validate import path to prevent path traversal attacks
if (!validateImportPath(importPath, basePath, [basePath])) {
processedContent = processedContent.replace(
match[0],
`<!-- Import failed: ${importPath} - Path traversal attempt -->`,
);
continue;
}
// Check if the import is for a non-md file and warn
if (!importPath.endsWith('.md')) {
logger.warn(
`Import processor only supports .md files. Attempting to import non-md file: ${importPath}. This will fail.`,
);
// Replace the import with a warning comment
processedContent = processedContent.replace(
match[0],
`<!-- Import failed: ${importPath} - Only .md files are supported -->`,
);
continue;
}
const fullPath = path.resolve(basePath, importPath);
if (debugMode) {
logger.debug(`Processing import: ${importPath} -> ${fullPath}`);
}
// Check for circular imports - if we're already processing this file
if (importState.currentFile === fullPath) {
if (debugMode) {
logger.warn(`Circular import detected: ${importPath}`);
}
// Replace the import with a warning comment
processedContent = processedContent.replace(
match[0],
`<!-- Circular import detected: ${importPath} -->`,
);
continue;
}
// Check if we've already processed this file in this import chain
if (importState.processedFiles.has(fullPath)) {
if (debugMode) {
logger.warn(`File already processed in this chain: ${importPath}`);
}
// Replace the import with a warning comment
processedContent = processedContent.replace(
match[0],
`<!-- File already processed: ${importPath} -->`,
);
continue;
}
// Check for potential circular imports by looking at the import chain
if (importState.currentFile) {
const currentFileDir = path.dirname(importState.currentFile);
const potentialCircularPath = path.resolve(currentFileDir, importPath);
if (potentialCircularPath === importState.currentFile) {
if (debugMode) {
logger.warn(`Circular import detected: ${importPath}`);
}
// Replace the import with a warning comment
processedContent = processedContent.replace(
match[0],
`<!-- Circular import detected: ${importPath} -->`,
);
continue;
}
}
try {
// Check if the file exists
await fs.access(fullPath);
// Read the imported file content
const importedContent = await fs.readFile(fullPath, 'utf-8');
if (debugMode) {
logger.debug(`Successfully read imported file: ${fullPath}`);
}
// Recursively process imports in the imported content
const processedImportedContent = await processImports(
importedContent,
path.dirname(fullPath),
debugMode,
{
...importState,
processedFiles: new Set([...importState.processedFiles, fullPath]),
currentDepth: importState.currentDepth + 1,
currentFile: fullPath, // Set the current file being processed
},
);
// Replace the import statement with the processed content
processedContent = processedContent.replace(
match[0],
`<!-- Imported from: ${importPath} -->\n${processedImportedContent}\n<!-- End of import from: ${importPath} -->`,
);
} catch (error) {
const errorMessage =
error instanceof Error ? error.message : String(error);
if (debugMode) {
logger.error(`Failed to import ${importPath}: ${errorMessage}`);
}
// Replace the import with an error comment
processedContent = processedContent.replace(
match[0],
`<!-- Import failed: ${importPath} - ${errorMessage} -->`,
);
}
}
return processedContent;
}
/**
* Validates import paths to ensure they are safe and within allowed directories
*
* @param importPath - The import path to validate
* @param basePath - The base directory for resolving relative paths
* @param allowedDirectories - Array of allowed directory paths
* @returns Whether the import path is valid
*/
export function validateImportPath(
importPath: string,
basePath: string,
allowedDirectories: string[],
): boolean {
// Reject URLs
if (/^(file|https?):\/\//.test(importPath)) {
return false;
}
const resolvedPath = path.resolve(basePath, importPath);
return allowedDirectories.some((allowedDir) => {
const normalizedAllowedDir = path.resolve(allowedDir);
return resolvedPath.startsWith(normalizedAllowedDir);
});
}