fix: CLAUDE.md compatibility for GEMINI.md '@' file import behavior (#2978)

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Allen Hutchison <adh@google.com>
Niladri Das 2025-07-31 22:06:50 +05:30 committed by GitHub
parent ae86c7ba05
commit 9a6422f331
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
14 changed files with 1355 additions and 215 deletions


@ -1,18 +1,14 @@
# Memory Import Processor
-The Memory Import Processor is a feature that allows you to modularize your GEMINI.md files by importing content from other markdown files using the `@file.md` syntax.
+The Memory Import Processor is a feature that allows you to modularize your GEMINI.md files by importing content from other files using the `@file.md` syntax.
## Overview
This feature enables you to break down large GEMINI.md files into smaller, more manageable components that can be reused across different contexts. The import processor supports both relative and absolute paths, with built-in safety features to prevent circular imports and ensure file access security.
-## Important Limitations
-**This feature only supports `.md` (markdown) files.** Attempting to import files with other extensions (like `.txt`, `.json`, etc.) will result in a warning and the import will fail.
## Syntax
-Use the `@` symbol followed by the path to the markdown file you want to import:
+Use the `@` symbol followed by the path to the file you want to import:
```markdown
# Main GEMINI.md file
@ -96,24 +92,10 @@ The `validateImportPath` function ensures that imports are only allowed from spe
### Maximum Import Depth
-To prevent infinite recursion, there's a configurable maximum import depth (default: 10 levels).
+To prevent infinite recursion, there's a configurable maximum import depth (default: 5 levels).
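The depth limit and the circular-import check can be sketched together as follows. This is a simplified illustration with hypothetical helper names (`mayImport`, `descend`), not the actual implementation:

```typescript
// Illustrative sketch of the recursion guards; names are hypothetical.
interface ImportState {
  processedFiles: Set<string>; // files already imported in this chain
  maxDepth: number;            // configurable limit (default: 5)
  currentDepth: number;
}

// Returns true when an import of `fullPath` may proceed.
function mayImport(state: ImportState, fullPath: string): boolean {
  if (state.currentDepth >= state.maxDepth) return false; // depth exceeded
  if (state.processedFiles.has(fullPath)) return false;   // circular or duplicate
  return true;
}

// Derive the state passed into the recursive call for `fullPath`.
function descend(state: ImportState, fullPath: string): ImportState {
  return {
    processedFiles: new Set([...state.processedFiles, fullPath]),
    maxDepth: state.maxDepth,
    currentDepth: state.currentDepth + 1,
  };
}
```

Each recursive call gets its own copy of the visited set, so sibling imports of the same file in different branches are tracked per chain.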
## Error Handling
-### Non-MD File Attempts
-If you try to import a non-markdown file, you'll see a warning:
-```markdown
-@./instructions.txt <!-- This will show a warning and fail -->
-```
-Console output:
-```
-[WARN] [ImportProcessor] Import processor only supports .md files. Attempting to import non-md file: ./instructions.txt. This will fail.
-```
### Missing Files
If a referenced file doesn't exist, the import will fail gracefully with an error comment in the output.
@ -122,6 +104,36 @@ If a referenced file doesn't exist, the import will fail gracefully with an erro
Permission issues or other file system errors are handled gracefully with appropriate error messages.
## Code Region Detection
The import processor uses the `marked` library to detect code blocks and inline code spans, ensuring that `@` imports inside these regions are properly ignored. This provides robust handling of nested code blocks and complex Markdown structures.
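A simplified stand-in for this detection, handling only ``` fences and single-backtick spans (the shipped implementation relies on `marked.lexer` for full Markdown coverage, so this sketch is an approximation):

```typescript
// Simplified code-region detection: returns [start, end) offsets of fenced
// blocks and inline code spans. The real processor uses the `marked` lexer.
function findCodeRegions(content: string): Array<[number, number]> {
  const regions: Array<[number, number]> = [];
  // Fenced blocks: pairs of ``` markers (lazy match keeps blocks separate).
  const fence = /```[\s\S]*?```/g;
  let m: RegExpExecArray | null;
  while ((m = fence.exec(content)) !== null) {
    regions.push([m.index, m.index + m[0].length]);
  }
  // Inline spans: single-backtick pairs that fall outside the fences above.
  const span = /`[^`\n]+`/g;
  while ((m = span.exec(content)) !== null) {
    const start = m.index;
    const end = start + m[0].length;
    if (!regions.some(([s, e]) => start >= s && start < e)) {
      regions.push([start, end]);
    }
  }
  return regions;
}

// An `@` import is ignored when its offset falls inside any region.
function isInCodeRegion(regions: Array<[number, number]>, offset: number): boolean {
  return regions.some(([s, e]) => offset >= s && offset < e);
}
```

With this in place, an `@path` occurrence is only treated as an import when `isInCodeRegion` returns false for its offset.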
## Import Tree Structure
The processor returns an import tree that shows the hierarchy of imported files, similar to Claude's `/memory` feature. This helps users debug problems with their GEMINI.md files by showing which files were read and their import relationships.
Example tree structure:
```
Memory Files
L project: GEMINI.md
L a.md
L b.md
L c.md
L d.md
L e.md
L f.md
L included.md
```
The tree preserves the order that files were imported and shows the complete import chain for debugging purposes.
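A listing like the one above can be produced from the `MemoryFile` structure documented in the API reference with a short recursive walk. The renderer here is a hypothetical sketch (the CLI's actual formatting may differ), with `MemoryFile` redeclared so the example is self-contained:

```typescript
// Hypothetical renderer for the import tree; the real CLI's output
// formatting may differ. MemoryFile is redeclared for self-containment.
interface MemoryFile {
  path: string;
  imports?: MemoryFile[];
}

// Emit one "L"-prefixed line per file, indenting one space per level.
function renderTree(file: MemoryFile, indent = ' '): string {
  const lines = [`${indent}L ${file.path}`];
  for (const child of file.imports ?? []) {
    lines.push(renderTree(child, indent + ' '));
  }
  return lines.join('\n');
}

// Example: GEMINI.md importing a.md, which in turn imports b.md.
const tree: MemoryFile = {
  path: 'GEMINI.md',
  imports: [{ path: 'a.md', imports: [{ path: 'b.md' }] }],
};
// 'Memory Files\n' + renderTree(tree) yields the indented listing.
```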
## Comparison to Claude Code's `/memory` (`claude.md`) Approach
Claude Code's `/memory` feature (as seen in `claude.md`) produces a flat, linear document by concatenating all included files, always marking file boundaries with clear comments and path names. It does not explicitly present the import hierarchy, but the LLM receives all file contents and paths, which is sufficient for reconstructing the hierarchy if needed.
Note: The import tree is mainly for clarity during development and has limited relevance to LLM consumption.
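The boundary markers used by the flat format in this commit can be sketched in isolation (`FlatFile` and `concatenateFlat` are illustrative names; the marker strings match the implementation):

```typescript
// Flat, Claude-style concatenation: each file's trimmed content is wrapped
// in boundary markers and the results are joined in import order.
interface FlatFile {
  path: string;
  content: string;
}

function concatenateFlat(files: FlatFile[]): string {
  return files
    .map(
      (f) =>
        `--- File: ${f.path} ---\n${f.content.trim()}\n--- End of File: ${f.path} ---`,
    )
    .join('\n\n');
}
```

Because every file boundary carries its path, the LLM can reconstruct which content came from which file even without an explicit tree.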
## API Reference
### `processImports(content, basePath, debugMode?, importState?)`
@ -135,7 +147,25 @@ Processes import statements in GEMINI.md content.
- `debugMode` (boolean, optional): Whether to enable debug logging (default: false)
- `importState` (ImportState, optional): State tracking for circular import prevention
-**Returns:** Promise<string> - Processed content with imports resolved
+**Returns:** Promise<ProcessImportsResult> - Object containing processed content and import tree
### `ProcessImportsResult`
```typescript
interface ProcessImportsResult {
content: string; // The processed content with imports resolved
importTree: MemoryFile; // Tree structure showing the import hierarchy
}
```
### `MemoryFile`
```typescript
interface MemoryFile {
path: string; // The file path
imports?: MemoryFile[]; // Direct imports, in the order they were imported
}
```
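Because `imports` preserves order, a caller can flatten the tree back into the ordered list of files that were read. A minimal sketch (the helper name is illustrative and not part of the API; `MemoryFile` is redeclared for self-containment):

```typescript
interface MemoryFile {
  path: string;
  imports?: MemoryFile[];
}

// Depth-first walk listing every file in the order it was imported.
function listImportedPaths(root: MemoryFile): string[] {
  const paths = [root.path];
  for (const child of root.imports ?? []) {
    paths.push(...listImportedPaths(child));
  }
  return paths;
}
```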
### `validateImportPath(importPath, basePath, allowedDirectories)`
@ -149,6 +179,16 @@ Validates import paths to ensure they are safe and within allowed directories.
**Returns:** boolean - Whether the import path is valid
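The containment check behind this validation can be shown as a self-contained sketch, matching the fixed logic in this commit: a resolved import must equal an allowed directory or live strictly underneath it, with `path.sep` appended to avoid false matches such as `/allowed-extra` for `/allowed`. (`isWithinAllowed` is an illustrative name; the examples assume POSIX paths.)

```typescript
import * as path from 'node:path';

// Sketch of validateImportPath's containment check (illustrative name).
function isWithinAllowed(
  importPath: string,
  basePath: string,
  allowedDirectories: string[],
): boolean {
  const resolvedPath = path.resolve(basePath, importPath);
  return allowedDirectories.some((allowedDir) => {
    const normalizedAllowedDir = path.resolve(allowedDir);
    const isSamePath = resolvedPath === normalizedAllowedDir;
    // Append the separator so "/repo-extra" is not treated as inside "/repo".
    const isSubPath = resolvedPath.startsWith(normalizedAllowedDir + path.sep);
    return isSamePath || isSubPath;
  });
}
```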
### `findProjectRoot(startDir)`
Finds the project root by searching for a `.git` directory upwards from the given start directory. Implemented as an **async** function using non-blocking file system APIs to avoid blocking the Node.js event loop.
**Parameters:**
- `startDir` (string): The directory to start searching from
**Returns:** Promise<string> - The project root directory (or the start directory if no `.git` is found)
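The upward search can be sketched synchronously with an injected predicate standing in for the `.git` check (the shipped function is async and uses non-blocking `fs` calls; the name `findProjectRootSync` and the predicate parameter are illustrative):

```typescript
import * as path from 'node:path';

// Simplified synchronous sketch of the upward search. `hasGitDir` stands in
// for the real async fs.lstat check on <dir>/.git.
function findProjectRootSync(
  startDir: string,
  hasGitDir: (dir: string) => boolean,
): string {
  let currentDir = path.resolve(startDir);
  while (true) {
    if (hasGitDir(currentDir)) return currentDir;
    const parentDir = path.dirname(currentDir);
    if (parentDir === currentDir) break; // reached the filesystem root
    currentDir = parentDir;
  }
  return path.resolve(startDir); // fallback: no .git found anywhere
}
```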
## Best Practices
1. **Use descriptive file names** for imported components
@ -161,7 +201,7 @@ Validates import paths to ensure they are safe and within allowed directories.
### Common Issues
-1. **Import not working**: Check that the file exists and has a `.md` extension
+1. **Import not working**: Check that the file exists and the path is correct
2. **Circular import warnings**: Review your import structure for circular references
3. **Permission errors**: Ensure the files are readable and within allowed directories
4. **Path resolution issues**: Use absolute paths if relative paths aren't resolving correctly

package-lock.json generated

@ -14,6 +14,7 @@
"gemini": "bundle/gemini.js"
},
"devDependencies": {
"@types/marked": "^5.0.2",
"@types/micromatch": "^4.0.9",
"@types/mime-types": "^3.0.1",
"@types/minimatch": "^5.1.2",
@ -2338,6 +2339,13 @@
"dev": true,
"license": "MIT"
},
"node_modules/@types/marked": {
"version": "5.0.2",
"resolved": "https://registry.npmjs.org/@types/marked/-/marked-5.0.2.tgz",
"integrity": "sha512-OucS4KMHhFzhz27KxmWg7J+kIYqyqoW5kdIEI319hqARQQUTqhao3M/F+uFnDXD0Rg72iDDZxZNxq5gvctmLlg==",
"dev": true,
"license": "MIT"
},
"node_modules/@types/micromatch": {
"version": "4.0.9",
"resolved": "https://registry.npmjs.org/@types/micromatch/-/micromatch-4.0.9.tgz",
@ -7687,6 +7695,18 @@
"node": ">=10"
}
},
"node_modules/marked": {
"version": "15.0.12",
"resolved": "https://registry.npmjs.org/marked/-/marked-15.0.12.tgz",
"integrity": "sha512-8dD6FusOQSrpv9Z1rdNMdlSgQOIP880DHqnohobOmYLElGEqAL/JvxvuxZO16r4HtjTlfPRDC1hbvxC9dPN2nA==",
"license": "MIT",
"bin": {
"marked": "bin/marked.js"
},
"engines": {
"node": ">= 18"
}
},
"node_modules/math-intrinsics": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/math-intrinsics/-/math-intrinsics-1.1.0.tgz",
@ -11861,6 +11881,7 @@
"html-to-text": "^9.0.5",
"https-proxy-agent": "^7.0.6",
"ignore": "^7.0.0",
"marked": "^15.0.12",
"micromatch": "^4.0.8",
"open": "^10.1.2",
"shell-quote": "^1.8.3",


@ -58,6 +58,7 @@
"LICENSE"
],
"devDependencies": {
"@types/marked": "^5.0.2",
"@types/micromatch": "^4.0.9",
"@types/mime-types": "^3.0.1",
"@types/minimatch": "^5.1.2",


@ -494,6 +494,7 @@ describe('Hierarchical Memory Loading (config.ts) - Placeholder Suite', () => {
'/path/to/ext3/context1.md',
'/path/to/ext3/context2.md',
],
'tree',
{
respectGitIgnore: false,
respectGeminiIgnore: true,


@ -59,6 +59,7 @@ export interface CliArgs {
experimentalAcp: boolean | undefined;
extensions: string[] | undefined;
listExtensions: boolean | undefined;
ideMode?: boolean | undefined;
ideModeFeature: boolean | undefined;
proxy: string | undefined;
includeDirectories: string[] | undefined;
@ -224,7 +225,11 @@ export async function parseArguments(): Promise<CliArgs> {
});
yargsInstance.wrap(yargsInstance.terminalWidth());
-return yargsInstance.argv;
+const result = yargsInstance.parseSync();
+// The import format is now only controlled by settings.memoryImportFormat
+// We no longer accept it as a CLI argument
+return result as CliArgs;
}
// This function is now a thin wrapper around the server's implementation.
@ -236,11 +241,12 @@ export async function loadHierarchicalGeminiMemory(
fileService: FileDiscoveryService,
settings: Settings,
extensionContextFilePaths: string[] = [],
memoryImportFormat: 'flat' | 'tree' = 'tree',
fileFilteringOptions?: FileFilteringOptions,
): Promise<{ memoryContent: string; fileCount: number }> {
if (debugMode) {
logger.debug(
-`CLI: Delegating hierarchical memory load to server for CWD: ${currentWorkingDirectory}`,
+`CLI: Delegating hierarchical memory load to server for CWD: ${currentWorkingDirectory} (memoryImportFormat: ${memoryImportFormat})`,
);
}
@ -251,6 +257,7 @@ export async function loadHierarchicalGeminiMemory(
debugMode,
fileService,
extensionContextFilePaths,
memoryImportFormat,
fileFilteringOptions,
settings.memoryDiscoveryMaxDirs,
);
@ -266,9 +273,12 @@ export async function loadCliConfig(
argv.debug ||
[process.env.DEBUG, process.env.DEBUG_MODE].some(
(v) => v === 'true' || v === '1',
-);
-const ideMode = settings.ideMode ?? false;
+) ||
+false;
+const memoryImportFormat = settings.memoryImportFormat || 'tree';
+const ideMode =
+(argv.ideMode ?? settings.ideMode ?? false) &&
+process.env.TERM_PROGRAM === 'vscode';
const ideModeFeature =
(argv.ideModeFeature ?? settings.ideModeFeature ?? false) &&
@ -314,6 +324,7 @@ export async function loadCliConfig(
fileService,
settings,
extensionContextFilePaths,
memoryImportFormat,
fileFiltering,
);


@ -98,6 +98,7 @@ export interface Settings {
summarizeToolOutput?: Record<string, SummarizeToolOutputSettings>;
vimMode?: boolean;
memoryImportFormat?: 'tree' | 'flat';
// Flag to be removed post-launch.
ideModeFeature?: boolean;


@ -277,6 +277,7 @@ const App = ({ config, settings, startupWarnings = [], version }: AppProps) => {
config.getFileService(),
settings.merged,
config.getExtensionContextFilePaths(),
settings.merged.memoryImportFormat || 'tree', // Use setting or default to 'tree'
config.getFileFilteringOptions(),
);


@ -92,6 +92,7 @@ export const memoryCommand: SlashCommand = {
config.getDebugMode(),
config.getFileService(),
config.getExtensionContextFilePaths(),
context.services.settings.merged.memoryImportFormat || 'tree', // Use setting or default to 'tree'
config.getFileFilteringOptions(),
context.services.settings.merged.memoryDiscoveryMaxDirs,
);


@ -39,6 +39,7 @@
"html-to-text": "^9.0.5",
"https-proxy-agent": "^7.0.6",
"ignore": "^7.0.0",
"marked": "^15.0.12",
"micromatch": "^4.0.8",
"open": "^10.1.2",
"shell-quote": "^1.8.3",


@ -305,10 +305,12 @@ Subdir memory
false,
new FileDiscoveryService(projectRoot),
[],
'tree',
{
respectGitIgnore: true,
respectGeminiIgnore: true,
},
200, // maxDirs parameter
);
expect(result).toEqual({
@ -334,6 +336,7 @@ My code memory
true,
new FileDiscoveryService(projectRoot),
[],
'tree', // importFormat
{
respectGitIgnore: true,
respectGeminiIgnore: true,


@ -43,7 +43,7 @@ async function findProjectRoot(startDir: string): Promise<string | null> {
while (true) {
const gitPath = path.join(currentDir, '.git');
try {
-const stats = await fs.stat(gitPath);
+const stats = await fs.lstat(gitPath);
if (stats.isDirectory()) {
return currentDir;
}
@ -230,6 +230,7 @@ async function getGeminiMdFilePathsInternal(
async function readGeminiMdFiles(
filePaths: string[],
debugMode: boolean,
importFormat: 'flat' | 'tree' = 'tree',
): Promise<GeminiFileContent[]> {
const results: GeminiFileContent[] = [];
for (const filePath of filePaths) {
@ -237,16 +238,19 @@ async function readGeminiMdFiles(
const content = await fs.readFile(filePath, 'utf-8');
// Process imports in the content
-const processedContent = await processImports(
+const processedResult = await processImports(
content,
path.dirname(filePath),
debugMode,
undefined,
undefined,
importFormat,
);
-results.push({ filePath, content: processedContent });
+results.push({ filePath, content: processedResult.content });
if (debugMode)
logger.debug(
-`Successfully read and processed imports: ${filePath} (Length: ${processedContent.length})`,
+`Successfully read and processed imports: ${filePath} (Length: ${processedResult.content.length})`,
);
} catch (error: unknown) {
const isTestEnv = process.env.NODE_ENV === 'test' || process.env.VITEST;
@ -293,12 +297,13 @@ export async function loadServerHierarchicalMemory(
debugMode: boolean,
fileService: FileDiscoveryService,
extensionContextFilePaths: string[] = [],
importFormat: 'flat' | 'tree' = 'tree',
fileFilteringOptions?: FileFilteringOptions,
maxDirs: number = 200,
): Promise<{ memoryContent: string; fileCount: number }> {
if (debugMode)
logger.debug(
-`Loading server hierarchical memory for CWD: ${currentWorkingDirectory}`,
+`Loading server hierarchical memory for CWD: ${currentWorkingDirectory} (importFormat: ${importFormat})`,
);
// For the server, homedir() refers to the server process's home.
@ -317,7 +322,11 @@ export async function loadServerHierarchicalMemory(
if (debugMode) logger.debug('No GEMINI.md files found in hierarchy.');
return { memoryContent: '', fileCount: 0 };
}
-const contentsWithPaths = await readGeminiMdFiles(filePaths, debugMode);
+const contentsWithPaths = await readGeminiMdFiles(
+filePaths,
+debugMode,
+importFormat,
+);
// Pass CWD for relative path display in concatenated content
const combinedInstructions = concatenateInstructions(
contentsWithPaths,

File diff suppressed because it is too large


@ -6,6 +6,7 @@
import * as fs from 'fs/promises';
import * as path from 'path';
import { marked } from 'marked';
// Simple console logger for import processing
const logger = {
@ -29,15 +30,176 @@ interface ImportState {
currentFile?: string; // Track the current file being processed
}
/**
* Interface representing a file in the import tree
*/
export interface MemoryFile {
path: string;
imports?: MemoryFile[]; // Direct imports, in the order they were imported
}
/**
* Result of processing imports
*/
export interface ProcessImportsResult {
content: string;
importTree: MemoryFile;
}
// Helper to find the project root (looks for .git directory)
async function findProjectRoot(startDir: string): Promise<string> {
let currentDir = path.resolve(startDir);
while (true) {
const gitPath = path.join(currentDir, '.git');
try {
const stats = await fs.lstat(gitPath);
if (stats.isDirectory()) {
return currentDir;
}
} catch {
// .git not found, continue to parent
}
const parentDir = path.dirname(currentDir);
if (parentDir === currentDir) {
// Reached filesystem root
break;
}
currentDir = parentDir;
}
// Fallback to startDir if .git not found
return path.resolve(startDir);
}
// Add a type guard for error objects
function hasMessage(err: unknown): err is { message: string } {
return (
typeof err === 'object' &&
err !== null &&
'message' in err &&
typeof (err as { message: unknown }).message === 'string'
);
}
// Helper to find all code block and inline code regions using marked
/**
* Finds all import statements in content without using regex
* @returns Array of {start, _end, path} objects for each import found
*/
function findImports(
content: string,
): Array<{ start: number; _end: number; path: string }> {
const imports: Array<{ start: number; _end: number; path: string }> = [];
let i = 0;
const len = content.length;
while (i < len) {
// Find next @ symbol
i = content.indexOf('@', i);
if (i === -1) break;
// Check if it's a word boundary (not part of another word)
if (i > 0 && !isWhitespace(content[i - 1])) {
i++;
continue;
}
// Find the end of the import path (whitespace or newline)
let j = i + 1;
while (
j < len &&
!isWhitespace(content[j]) &&
content[j] !== '\n' &&
content[j] !== '\r'
) {
j++;
}
// Extract the path (everything after @)
const importPath = content.slice(i + 1, j);
// Basic validation (starts with ./ or / or letter)
if (
importPath.length > 0 &&
(importPath[0] === '.' ||
importPath[0] === '/' ||
isLetter(importPath[0]))
) {
imports.push({
start: i,
_end: j,
path: importPath,
});
}
i = j + 1;
}
return imports;
}
function isWhitespace(char: string): boolean {
return char === ' ' || char === '\t' || char === '\n' || char === '\r';
}
function isLetter(char: string): boolean {
const code = char.charCodeAt(0);
return (
(code >= 65 && code <= 90) || // A-Z
(code >= 97 && code <= 122)
); // a-z
}
function findCodeRegions(content: string): Array<[number, number]> {
const regions: Array<[number, number]> = [];
const tokens = marked.lexer(content);
// Map from raw content to a queue of its start indices in the original content.
const rawContentIndices = new Map<string, number[]>();
function walk(token: { type: string; raw: string; tokens?: unknown[] }) {
if (token.type === 'code' || token.type === 'codespan') {
if (!rawContentIndices.has(token.raw)) {
const indices: number[] = [];
let lastIndex = -1;
while ((lastIndex = content.indexOf(token.raw, lastIndex + 1)) !== -1) {
indices.push(lastIndex);
}
rawContentIndices.set(token.raw, indices);
}
const indices = rawContentIndices.get(token.raw);
if (indices && indices.length > 0) {
// Assume tokens are processed in order of appearance.
// Dequeue the next available index for this raw content.
const idx = indices.shift()!;
regions.push([idx, idx + token.raw.length]);
}
}
if ('tokens' in token && token.tokens) {
for (const child of token.tokens) {
walk(child as { type: string; raw: string; tokens?: unknown[] });
}
}
}
for (const token of tokens) {
walk(token);
}
return regions;
}
/**
* Processes import statements in GEMINI.md content
* Supports @path/to/file.md syntax for importing content from other files
*
* Supports @path/to/file syntax for importing content from other files
* @param content - The content to process for imports
* @param basePath - The directory path where the current file is located
* @param debugMode - Whether to enable debug logging
* @param importState - State tracking for circular import prevention
* @returns Processed content with imports resolved
* @param projectRoot - The project root directory for allowed directories
* @param importFormat - The format of the import tree
* @returns Processed content with imports resolved and import tree
*/
export async function processImports(
content: string,
@ -45,156 +207,198 @@ export async function processImports(
debugMode: boolean = false,
importState: ImportState = {
processedFiles: new Set(),
-maxDepth: 10,
+maxDepth: 5,
currentDepth: 0,
},
-): Promise<string> {
+projectRoot?: string,
+importFormat: 'flat' | 'tree' = 'tree',
+): Promise<ProcessImportsResult> {
if (!projectRoot) {
projectRoot = await findProjectRoot(basePath);
}
if (importState.currentDepth >= importState.maxDepth) {
if (debugMode) {
logger.warn(
`Maximum import depth (${importState.maxDepth}) reached. Stopping import processing.`,
);
}
-return content;
+return {
+content,
+importTree: { path: importState.currentFile || 'unknown' },
+};
}
-// Regex to match @path/to/file imports (supports any file extension)
-// Supports both @path/to/file.md and @./path/to/file.md syntax
-const importRegex = /@([./]?[^\s\n]+\.[^\s\n]+)/g;
// --- FLAT FORMAT LOGIC ---
if (importFormat === 'flat') {
// Use a queue to process files in order of first encounter, and a set to avoid duplicates
const flatFiles: Array<{ path: string; content: string }> = [];
// Track processed files across the entire operation
const processedFiles = new Set<string>();
let processedContent = content;
let match: RegExpExecArray | null;
// Helper to recursively process imports
async function processFlat(
fileContent: string,
fileBasePath: string,
filePath: string,
depth: number,
) {
// Normalize the file path to ensure consistent comparison
const normalizedPath = path.normalize(filePath);
// Process all imports in the content
while ((match = importRegex.exec(content)) !== null) {
const importPath = match[1];
// Skip if already processed
if (processedFiles.has(normalizedPath)) return;
// Validate import path to prevent path traversal attacks
if (!validateImportPath(importPath, basePath, [basePath])) {
processedContent = processedContent.replace(
match[0],
`<!-- Import failed: ${importPath} - Path traversal attempt -->`,
);
continue;
}
// Mark as processed before processing to prevent infinite recursion
processedFiles.add(normalizedPath);
// Check if the import is for a non-md file and warn
if (!importPath.endsWith('.md')) {
logger.warn(
`Import processor only supports .md files. Attempting to import non-md file: ${importPath}. This will fail.`,
);
// Replace the import with a warning comment
processedContent = processedContent.replace(
match[0],
`<!-- Import failed: ${importPath} - Only .md files are supported -->`,
);
continue;
}
// Add this file to the flat list
flatFiles.push({ path: normalizedPath, content: fileContent });
const fullPath = path.resolve(basePath, importPath);
// Find imports in this file
const codeRegions = findCodeRegions(fileContent);
const imports = findImports(fileContent);
if (debugMode) {
logger.debug(`Processing import: ${importPath} -> ${fullPath}`);
}
// Process imports in reverse order to handle indices correctly
for (let i = imports.length - 1; i >= 0; i--) {
const { start, _end, path: importPath } = imports[i];
// Check for circular imports - if we're already processing this file
if (importState.currentFile === fullPath) {
if (debugMode) {
logger.warn(`Circular import detected: ${importPath}`);
}
// Replace the import with a warning comment
processedContent = processedContent.replace(
match[0],
`<!-- Circular import detected: ${importPath} -->`,
);
continue;
}
// Check if we've already processed this file in this import chain
if (importState.processedFiles.has(fullPath)) {
if (debugMode) {
logger.warn(`File already processed in this chain: ${importPath}`);
}
// Replace the import with a warning comment
processedContent = processedContent.replace(
match[0],
`<!-- File already processed: ${importPath} -->`,
);
continue;
}
// Check for potential circular imports by looking at the import chain
if (importState.currentFile) {
const currentFileDir = path.dirname(importState.currentFile);
const potentialCircularPath = path.resolve(currentFileDir, importPath);
if (potentialCircularPath === importState.currentFile) {
if (debugMode) {
logger.warn(`Circular import detected: ${importPath}`);
// Skip if inside a code region
if (
codeRegions.some(
([regionStart, regionEnd]) =>
start >= regionStart && start < regionEnd,
)
) {
continue;
}
// Validate import path
if (
!validateImportPath(importPath, fileBasePath, [projectRoot || ''])
) {
continue;
}
const fullPath = path.resolve(fileBasePath, importPath);
const normalizedFullPath = path.normalize(fullPath);
// Skip if already processed
if (processedFiles.has(normalizedFullPath)) continue;
try {
await fs.access(fullPath);
const importedContent = await fs.readFile(fullPath, 'utf-8');
// Process the imported file
await processFlat(
importedContent,
path.dirname(fullPath),
normalizedFullPath,
depth + 1,
);
} catch (error) {
if (debugMode) {
logger.warn(
`Failed to import ${fullPath}: ${hasMessage(error) ? error.message : 'Unknown error'}`,
);
}
// Continue with other imports even if one fails
}
// Replace the import with a warning comment
processedContent = processedContent.replace(
match[0],
`<!-- Circular import detected: ${importPath} -->`,
);
continue;
}
}
// Start with the root file (current file)
const rootPath = path.normalize(
importState.currentFile || path.resolve(basePath),
);
await processFlat(content, basePath, rootPath, 0);
// Concatenate all unique files in order, Claude-style
const flatContent = flatFiles
.map(
(f) =>
`--- File: ${f.path} ---\n${f.content.trim()}\n--- End of File: ${f.path} ---`,
)
.join('\n\n');
return {
content: flatContent,
importTree: { path: rootPath }, // Tree not meaningful in flat mode
};
}
// --- TREE FORMAT LOGIC (existing) ---
const codeRegions = findCodeRegions(content);
let result = '';
let lastIndex = 0;
const imports: MemoryFile[] = [];
const importsList = findImports(content);
for (const { start, _end, path: importPath } of importsList) {
// Add content before this import
result += content.substring(lastIndex, start);
lastIndex = _end;
// Skip if inside a code region
if (codeRegions.some(([s, e]) => start >= s && start < e)) {
result += `@${importPath}`;
continue;
}
// Validate import path to prevent path traversal attacks
if (!validateImportPath(importPath, basePath, [projectRoot || ''])) {
result += `<!-- Import failed: ${importPath} - Path traversal attempt -->`;
continue;
}
const fullPath = path.resolve(basePath, importPath);
if (importState.processedFiles.has(fullPath)) {
result += `<!-- File already processed: ${importPath} -->`;
continue;
}
try {
// Check if the file exists
await fs.access(fullPath);
// Read the imported file content
const importedContent = await fs.readFile(fullPath, 'utf-8');
if (debugMode) {
logger.debug(`Successfully read imported file: ${fullPath}`);
}
// Recursively process imports in the imported content
const processedImportedContent = await processImports(
importedContent,
const fileContent = await fs.readFile(fullPath, 'utf-8');
// Mark this file as processed for this import chain
const newImportState: ImportState = {
...importState,
processedFiles: new Set(importState.processedFiles),
currentDepth: importState.currentDepth + 1,
currentFile: fullPath,
};
newImportState.processedFiles.add(fullPath);
const imported = await processImports(
fileContent,
path.dirname(fullPath),
debugMode,
{
...importState,
processedFiles: new Set([...importState.processedFiles, fullPath]),
currentDepth: importState.currentDepth + 1,
currentFile: fullPath, // Set the current file being processed
},
newImportState,
projectRoot,
importFormat,
);
// Replace the import statement with the processed content
processedContent = processedContent.replace(
match[0],
`<!-- Imported from: ${importPath} -->\n${processedImportedContent}\n<!-- End of import from: ${importPath} -->`,
);
} catch (error) {
const errorMessage =
error instanceof Error ? error.message : String(error);
if (debugMode) {
logger.error(`Failed to import ${importPath}: ${errorMessage}`);
result += `<!-- Imported from: ${importPath} -->\n${imported.content}\n<!-- End of import from: ${importPath} -->`;
imports.push(imported.importTree);
} catch (err: unknown) {
let message = 'Unknown error';
if (hasMessage(err)) {
message = err.message;
} else if (typeof err === 'string') {
message = err;
}
// Replace the import with an error comment
processedContent = processedContent.replace(
match[0],
`<!-- Import failed: ${importPath} - ${errorMessage} -->`,
);
logger.error(`Failed to import ${importPath}: ${message}`);
result += `<!-- Import failed: ${importPath} - ${message} -->`;
}
}
// Add any remaining content after the last match
result += content.substring(lastIndex);
return processedContent;
return {
content: result,
importTree: {
path: importState.currentFile || 'unknown',
imports: imports.length > 0 ? imports : undefined,
},
};
}
/**
* Validates import paths to ensure they are safe and within allowed directories
*
* @param importPath - The import path to validate
* @param basePath - The base directory for resolving relative paths
* @param allowedDirectories - Array of allowed directory paths
* @returns Whether the import path is valid
*/
export function validateImportPath(
importPath: string,
basePath: string,
@ -209,6 +413,8 @@ export function validateImportPath(
return allowedDirectories.some((allowedDir) => {
const normalizedAllowedDir = path.resolve(allowedDir);
-return resolvedPath.startsWith(normalizedAllowedDir);
+const isSamePath = resolvedPath === normalizedAllowedDir;
+const isSubPath = resolvedPath.startsWith(normalizedAllowedDir + path.sep);
+return isSamePath || isSubPath;
});
}


@ -0,0 +1,51 @@
/**
* @license
* Copyright 2025 Google LLC
* SPDX-License-Identifier: Apache-2.0
*/
import path from 'path';
import { fileURLToPath } from 'url';
// Test how paths are normalized
function testPathNormalization() {
// Use platform-agnostic path construction instead of hardcoded paths
const testPath = path.join('test', 'project', 'src', 'file.md');
const absoluteTestPath = path.resolve('test', 'project', 'src', 'file.md');
console.log('Testing path normalization:');
console.log('Relative path:', testPath);
console.log('Absolute path:', absoluteTestPath);
// Test path.join with different segments
const joinedPath = path.join('test', 'project', 'src', 'file.md');
console.log('Joined path:', joinedPath);
// Test path.normalize
console.log('Normalized relative path:', path.normalize(testPath));
console.log('Normalized absolute path:', path.normalize(absoluteTestPath));
// Test how the test would see these paths
const testContent = `--- File: ${absoluteTestPath} ---\nContent\n--- End of File: ${absoluteTestPath} ---`;
console.log('\nTest content with platform-agnostic paths:');
console.log(testContent);
// Try to match with different patterns
const marker = `--- File: ${absoluteTestPath} ---`;
console.log('\nTrying to match:', marker);
console.log('Direct match:', testContent.includes(marker));
// Test with normalized path in marker
const normalizedMarker = `--- File: ${path.normalize(absoluteTestPath)} ---`;
console.log(
'Normalized marker match:',
testContent.includes(normalizedMarker),
);
// Test path resolution
const __filename = fileURLToPath(import.meta.url);
console.log('\nCurrent file path:', __filename);
console.log('Directory name:', path.dirname(__filename));
}
testPathNormalization();