Library constantly increases memory usage in long-running applications #959

Open
opened 2024-06-10 01:56:55 -05:00 by pPrecel · 1 comment
pPrecel commented 2024-06-10 01:56:55 -05:00 (Migrated from github.com)

Description:

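I observed in one long-running application that memory usage grows constantly when the application uses the git2go library to clone a repo (https://github.com/kyma-project/serverless/blob/main/components/serverless/internal/git/go2git.go#L108) or to compute the last commit hash of a branch or tag (https://github.com/kyma-project/serverless/blob/main/components/serverless/internal/git/go2git.go#L68). We run the application in Kubernetes, so the pod eventually fails with an out of memory error.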

I prepared a hello world application that easily reproduces the problem. It clones the repo (init plus fetch) in an infinite loop; at the end of each iteration it removes the tmp directory and calls the Free method on every git2go object:

package main

import (
	"fmt"
	"os"

	git2go "github.com/libgit2/git2go/v34"
)

func main() {
	fmt.Println("starting...")

	iter := 1
	for {
		fmt.Printf("iteration %d...\n", iter)

		err := fetchRepo("https://github.com/kyma-project/serverless")
		if err != nil {
			fmt.Printf("WARN: %s\n", err.Error())
		}

		iter++
	}

}

const (
	branchRefPattern = "refs/remotes/origin"
)

func fetchRepo(repoUrl string) error {
	// create tmp dir
	repoDir := "/tmp/git2go_test_"
	err := os.MkdirAll(repoDir, 0700)
	if err != nil {
		return err
	}
	defer os.RemoveAll(repoDir)

	// init repo structure
	repo, err := git2go.InitRepository(repoDir, true)
	if err != nil {
		return err
	}
	defer repo.Free()

	// create/reuse remote
	remote, err := lookupCreateRemote(repo, repoUrl)
	if err != nil {
		return err
	}
	defer remote.Free()

	// fetch remote
	err = remote.Fetch(nil,
		&git2go.FetchOptions{
			DownloadTags: git2go.DownloadTagsAll,
		}, "")
	if err != nil {
		return err
	}

	return nil
}

// lookupCreateRemote looks up the "origin" remote and creates it if it doesn't exist
func lookupCreateRemote(repo *git2go.Repository, url string) (*git2go.Remote, error) {
	remote, err := repo.Remotes.Lookup("origin")
	if err == nil {
		return remote, nil
	}

	return repo.Remotes.Create("origin", url)
}

The app can be built by running docker build -t <tag> . on this Dockerfile (or you can simply reuse my image pprecel/git2go:latest):

FROM golang:1.22.3-alpine3.20 as builder

WORKDIR /app

RUN apk add --no-cache gcc libc-dev
RUN apk add --no-cache --repository http://dl-cdn.alpinelinux.org/alpine/v3.18/community libgit2-dev=1.5.2-r0

COPY . /app

RUN pwd

RUN go mod tidy
RUN CGO_ENABLED=1 GOOS=linux GOARCH=amd64 go build -o /app/main /app/main.go

FROM alpine:latest

RUN apk add --no-cache --repository http://dl-cdn.alpinelinux.org/alpine/v3.18/community libgit2-dev=1.5.2-r0

COPY --from=builder /app /app

CMD ["/app/main"]

The application runs in any container environment, so either Docker or Kubernetes will reproduce the problem:

docker run -d --name git2go-2 pprecel/git2go:latest

or

kubectl run git2go --image=pprecel/git2go:latest

On my machine, memory usage increased overnight from 28Mi to almost 3000Mi, and it is still increasing. Example:

kubectl top pods
NAME                       CPU(cores)   MEMORY(bytes)
git2go                       532m            20Mi

After a night:

kubectl top pods
NAME                       CPU(cores)   MEMORY(bytes)
git2go                       532m            3012Mi
pPrecel commented 2024-06-10 02:34:12 -05:00 (Migrated from github.com)

The output from the ContainerWatch extension:

Screenshot 2024-06-10 at 09 33 05: https://github.com/libgit2/git2go/assets/28900355/f12aa14e-2cfb-42a5-955a-e48284f63e82