[{"data":1,"prerenderedAt":818},["ShallowReactive",2],{"/en-us/blog/partial-clone-for-massive-repositories":3,"navigation-en-us":38,"banner-en-us":449,"footer-en-us":459,"blog-post-authors-en-us-James Ramsay":701,"blog-related-posts-en-us-partial-clone-for-massive-repositories":715,"blog-promotions-en-us":754,"next-steps-en-us":808},{"id":4,"title":5,"authorSlugs":6,"authors":8,"body":10,"category":11,"categorySlug":11,"config":12,"content":16,"date":20,"description":17,"extension":24,"externalUrl":25,"featured":14,"heroImage":19,"isFeatured":14,"meta":26,"navigation":27,"path":28,"publishedDate":20,"rawbody":29,"seo":30,"slug":13,"stem":34,"tagSlugs":35,"tags":36,"template":15,"updatedDate":25,"__hash__":37},"blogPosts/en-us/blog/partial-clone-for-massive-repositories.yml","How Git Partial Clone lets you fetch only the large file you need",[7],"james-ramsay",[9],"James Ramsay","The Git project began nearly 15 years ago, on [April 7,\n2005](https://marc.info/?l=linux-kernel&m=111288700902396), and is now the\n[version control system](/topics/version-control/) of choice for developers. Yet, there are certain types of projects that\noften do not use Git, particularly projects that have many large binary files,\nsuch as video games. One reason projects with large binary files don't use Git\nis because, when a Git repository is cloned, Git will download every version of\nevery file in the repo. For most use cases, downloading this history is a\nuseful feature, but it slows cloning and fetching for projects with large binary\nfiles, assuming the project even fits on your computer.\n\n## What is Partial Clone?\n\nPartial Clone is a new feature of Git that replaces [Git\nLFS](https://git-lfs.github.com/) and makes working with very large repositories\nbetter by teaching Git how to work without downloading every file. 
Partial Clone\nhas been\n[years](https://public-inbox.org/git/xmqqeg4o27zw.fsf@gitster.mtv.corp.google.com/)\nin the making, with code contributions from GitLab, GitHub, Microsoft and\nGoogle. Today it is experimentally available in Git and GitLab, and can be\nenabled by administrators\n([docs](https://docs.gitlab.com/topics/git/partial_clone/)).\n\nPartial Clone speeds up fetching and cloning because less data is\ntransferred, and reduces disk usage on your local computer. For example, cloning\n[`gitlab-com/www-gitlab-com`](https://gitlab.com/gitlab-com/www-gitlab-com)\nusing Partial Clone (`--filter=blob:none`) is at least 50% faster, and transfers\n70% less data.\n\nNote: Partial Clone is one specific performance optimization for very large\nrepositories. [Sparse\nCheckout](https://github.blog/2020-01-17-bring-your-monorepo-down-to-size-with-sparse-checkout/)\nis a related optimization that is particularly focused on repositories with\ntremendously large numbers of files and revisions, such as the\n[Windows](https://devblogs.microsoft.com/bharry/the-largest-git-repo-on-the-planet/)\ncode base.\n\n## A brief history of large files\n\n\"What about Git LFS?\" you may ask. Doesn't LFS stand for \"large file storage\"?\n\nPreviously, extra tools were required to store large files in Git. In 2010,\n[git-annex](https://git-annex.branchable.com/) was released, and five years\nlater in 2015, [Git LFS](https://git-lfs.github.com/) was released. Both\ngit-annex and Git LFS added large file support to Git in a similar way: Instead\nof storing a large file in Git, store a pointer file that links to the large\nfile. Then, when someone needs a large file, they can download it on-demand\nusing the pointer.\n\nThe criticism of this approach is that there are now two places to store files,\nin Git or in Git LFS, which means that everyone must remember that big files need\nto go in Git LFS to keep the Git repo small and fast. There are downsides to\nthis approach. 
Besides being susceptible to human error, the pointer encodes\ndecisions based on bandwidth and file type into the structure of the repository\nthat affect everyone using the repository. Our assumptions about\nbandwidth and storage are likely to change over time, and vary by location,\nbut decisions encoded in the repository are not flexible. Administrators and\ndevelopers alike benefit from flexibility in where to store large files, and\nwhich files to download.\n\nPartial Clone solves these problems by removing the need for two classes of\nstorage, and special pointers. Let's walk through an example to understand how.\n\n## How to get started with Partial Clone\n\nLet's continue to use `gitlab-com/www-gitlab-com` as an example project, since\nit has quite a lot of images. For a larger repository, like a video game with\ndetailed textures and models that could take up a lot of disk space, the benefits will be even more significant.\n\nInstead of a vanilla `git clone`, we will include a filter spec which controls\nwhat is excluded when fetching data. In this situation, we just want to exclude\nlarge binary files. I've included `--no-checkout` so we can more clearly observe\nwhat is happening.\n\n```bash\ngit clone --filter=blob:none --no-checkout git@gitlab.com:gitlab-com/www-gitlab-com.git\n# Cloning into 'www-gitlab-com'...\n# remote: Enumerating objects: 624541, done.\n# remote: Counting objects: 100% (624541/624541), done.\n# remote: Compressing objects: 100% (151886/151886), done.\n# remote: Total 624541 (delta 432983), reused 622339 (delta 430843), pack-reused 0\n# Receiving objects: 100% (624541/624541), 74.61 MiB | 8.14 MiB/s, done.\n# Resolving deltas: 100% (432983/432983), done.\n```\n\nAbove we explicitly told Git not to check out the default branch. Normally\n`checkout` doesn't require fetching any data from the server, because we have\neverything locally. 
When using Partial Clone, since we are deliberately not downloading everything, Git will need to fetch any missing files when doing a\ncheckout.\n\n```bash\ngit checkout master\n# remote: Enumerating objects: 12080, done.\n# remote: Counting objects: 100% (12080/12080), done.\n# remote: Compressing objects: 100% (11640/11640), done.\n# remote: Total 12080 (delta 442), reused 9773 (delta 409), pack-reused 0\n# Receiving objects: 100% (12080/12080), 1.10 GiB | 8.49 MiB/s, done.\n# Resolving deltas: 100% (442/442), done.\n# Updating files: 100% (12342/12342), done.\n# Filtering content: 100% (3/3), 131.24 MiB | 4.73 MiB/s, done.\n```\n\nIf we checkout a different branch or commit, we'll need to download more missing\nfiles.\n\n```bash\ngit checkout 92d1f39b60f957d0bc3c5621bb3e17a3984bdf72\n# remote: Enumerating objects: 1968, done.\n# remote: Counting objects: 100% (1968/1968), done.\n# remote: Compressing objects: 100% (1953/1953), done.\n# remote: Total 1968 (delta 23), reused 1623 (delta 15), pack-reused 0\n# Receiving objects: 100% (1968/1968), 327.44 MiB | 8.83 MiB/s, done.\n# Resolving deltas: 100% (23/23), done.\n# Updating files: 100% (2255/2255), done.\n# Note: switching to '92d1f39b60f957d0bc3c5621bb3e17a3984bdf72'.\n```\n\nGit remembers the filter spec we provided when cloning the repository so that\nfetching updates will also exclude large files until we need them.\n\n```bash\ngit config remote.origin.promisor\n# true\n\ngit config remote.origin.partialclonefilter\n# blob:none\n```\n\nWhen committing changes, you simply commit binary files like you would any other\nfile. There is no extra tool to install or configure, no need to treat big files\ndifferently to small files.\n\n## Network and Storage\n\nIf you are already using [Git LFS](https://git-lfs.github.com/) today, you might\nbe aware that large files are stored and transferred differently to regular Git\nobjects. 
On GitLab.com, Git LFS objects are stored in object storage (like AWS\nS3) rather than fast attached storage (like SSD), and transferred over HTTP even\nwhen using SSH for regular Git objects. Using object storage has the advantage\nof reducing storage costs for large binary files, while using simpler HTTP\nrequests for large downloads allows the possibility of resumable and parallel\ndownloads.\n\nPartial Clone\n[already](https://public-inbox.org/git/20190625134039.21707-1-chriscool@tuxfamily.org/)\nsupports more than one remote, and work is underway to allow large files to be\nstored in a different location such as object storage. Unlike Git LFS, however,\nthe repository or instance administrator will be able to choose which objects\nshould be stored where, and change this configuration over time if needed.\n\nFollow the epic for [improved large file\nstorage](https://gitlab.com/groups/gitlab-org/-/epics/1487) to learn more and\nfollow our progress.\n\n## Performance\n\nWhen fetching new objects from the Git server using a [filter\nspec](https://github.com/git/git/blob/v2.25.0/Documentation/rev-list-options.txt#L735)\nto exclude objects from the response, Git will check each object and exclude\nany that match the filter spec. In [Git\n2.25](https://raw.githubusercontent.com/git/git/master/Documentation/RelNotes/2.25.0.txt),\nthe most recent version, filtering has not been optimized for performance.\n\n[Jeff King (Peff)](https://github.com/peff/) (GitHub) recently\n[contributed](https://public-inbox.org/git/20200214182147.GA654525@coredump.intra.peff.net/)\nperformance improvements for blob size filtering, which will likely be included\nin [Git 2.26](https://gitlab.com/gitlab-org/gitaly/issues/2497), and our plan is\nto include it in the GitLab 12.10 release.\n\nOptimizing the sparse filter spec option (`--filter=sparse:oid`), which filters\nbased on file path, is more complex because blobs, which contain the file\ncontent, do not include file path information. 
The directory structure of a\nrepository is stored in tree objects.\n\nFollow the epic for [Partial Clone performance\nimprovements](https://gitlab.com/groups/gitlab-org/-/epics/1671) to learn more\nand follow our progress.\n\n## Usability\n\nOne of the drawbacks of Git LFS was that it required installing an additional\ntool. In comparison, Partial Clone does not require any additional tools.\nHowever, it does require learning new options and configurations, such as how to\nclone using the `--filter` option.\n\nWe want to make it easy for people to get their work done, people who simply\nwant Git to just work. They shouldn't need to work out the optimal blob size\nfilter spec for a project, or even know what a filter spec is. While Partial Clone remains\nexperimental, we haven't made any changes to the GitLab interface to highlight\nPartial Clone, but we are investigating this and welcome your feedback. Please\njoin the conversation on this\n[issue](https://gitlab.com/gitlab-org/gitlab/issues/207744).\n\n## File locking and tool integrations\n\nAny conversation about large binary files, particularly in regard to video\ngames, is incomplete without discussing file locking and tooling integrations.\n\nUnlike plain text [source code](/solutions/source-code-management/), resolving conflicts between different versions of\na binary file is often impossible. To prevent conflicts in binary file editing,\nan exclusive file lock is used, meaning only one person at a time can edit a\nsingle file, regardless of branches. If conflicts can't be resolved, allowing multiple\nversions of an individual file to be created in parallel on different branches is a bug, not\na feature. 
GitLab already has basic file locking support, but it is really only\nuseful for plain text because it only applies to the default branch, and is not\nintegrated with any local tools.\n\nLocal tooling integrations are important for binary asset workflows, to\nautomatically propagate file locks to the local development environment, and to\nallow artists to work on assets without needing to use Git from the command\nline. Propagating file locks quickly to local development environments is also\nimportant because it prevents work from being wasted before it even happens.\n\nFollow the [file locking](https://gitlab.com/groups/gitlab-org/-/epics/1488) and\n[integrations](https://gitlab.com/groups/gitlab-org/-/epics/2704) epics for more\ninformation about what we're working on.\n\n## Conclusion\n\nLarge files are necessary for many projects, and Git will soon support this\nnatively, without the need for extra tools. Although Partial Clone is still an\nexperimental feature, we are making improvements with every release and the\nfeature is now ready for testing.\n\nThank you to the Git community for your work over the past years on improving\nsupport for enormous repositories. 
Particularly, thank you to [Jeff\nKing](https://github.com/peff/) (GitHub) and [Christian\nCouder](https://about.gitlab.com/company/team/#chriscool) (senior backend\nengineer on Gitaly at GitLab) for your early experimentation with Partial Clone,\nJonathan Tan (Google) and [Jeff Hostetler](https://github.com/jeffhostetler)\n(Microsoft) for contributing the [first\nimplementation](https://public-inbox.org/git/cover.1506714999.git.jonathantanmy@google.com/)\nof Partial Clone and promisor remotes, and the many others who've also\ncontributed.\n\nIf you are already using Partial Clone, or would like to help us test Partial\nClone on a large project, please get in touch with me, [James\nRamsay](https://about.gitlab.com/company/team/#jramsay) (group manager, product\nfor Create at GitLab), [Jordi\nMon](https://about.gitlab.com/company/team/#jordi_mon) (senior product marketing\nmanager for Dev at GitLab), or your account manager.\n\nFor more information on Partial Clone, check out [the documentation](https://docs.gitlab.com/topics/git/partial_clone/).\n\nCover image by [Simon Boxus](https://unsplash.com/@simonlerouge) on\n[Unsplash](https://unsplash.com/photos/4ftI4lCcByM)\n","open-source",{"slug":13,"featured":14,"template":15},"partial-clone-for-massive-repositories",false,"BlogPost",{"title":5,"description":17,"authors":18,"heroImage":19,"date":20,"body":10,"category":11,"tags":21},"Work faster with this experimental Partial Clone feature for huge Git repositories, saving you time, bandwidth, and storage, one large file at a time.",[9],"https://res.cloudinary.com/about-gitlab-com/image/upload/v1749681131/Blog/Hero%20Images/partial-clone-for-massive-repositories.jpg","2020-03-13",[22,23],"git","performance","yml",null,{},true,"/en-us/blog/partial-clone-for-massive-repositories","seo:\n  title: How Git Partial Clone lets you fetch only the large file you need\n  description: >-\n    Work faster with this experimental Partial Clone feature for huge Git\n    repositories, 
saving you time, bandwidth, and storage, one large file at a\n    time.\n  ogTitle: How Git Partial Clone lets you fetch only the large file you need\n  ogDescription: >-\n    Work faster with this experimental Partial Clone feature for huge Git\n    repositories, saving you time, bandwidth, and storage, one large file at a\n    time.\n  noIndex: false\n  ogImage: >-\n    https://res.cloudinary.com/about-gitlab-com/image/upload/v1749681131/Blog/Hero%20Images/partial-clone-for-massive-repositories.jpg\n  ogUrl: https://about.gitlab.com/blog/partial-clone-for-massive-repositories\n  ogSiteName: https://about.gitlab.com\n  ogType: article\n  canonicalUrls: https://about.gitlab.com/blog/partial-clone-for-massive-repositories\ncontent:\n  title: How Git Partial Clone lets you fetch only the large file you need\n  description: >-\n    Work faster with this experimental Partial Clone feature for huge Git\n    repositories, saving you time, bandwidth, and storage, one large file at a\n    time.\n  authors:\n    - James Ramsay\n  heroImage: >-\n    https://res.cloudinary.com/about-gitlab-com/image/upload/v1749681131/Blog/Hero%20Images/partial-clone-for-massive-repositories.jpg\n  date: '2020-03-13'\n  body: >\n    The Git project began nearly 15 years ago, on [April 7,\n\n    2005](https://marc.info/?l=linux-kernel&m=111288700902396), and is now the\n\n    [version control system](/topics/version-control/) of choice for developers.\n    Yet, there are certain types of projects that\n\n    often do not use Git, particularly projects that have many large binary\n    files,\n\n    such as video games. One reason projects with large binary files don't use\n    Git\n\n    is because, when a Git repository is cloned, Git will download every version\n    of\n\n    every file in the repo. 
For most use cases, downloading this history is a\n\n    useful feature, but it slows cloning and fetching for projects with large\n    binary\n\n    files, assuming the project even fits on your computer.\n\n\n    ## What is Partial Clone?\n\n\n    Partial Clone is a new feature of Git that replaces [Git\n\n    LFS](https://git-lfs.github.com/) and makes working with very large\n    repositories\n\n    better by teaching Git how to work without downloading every file. Partial\n    Clone\n\n    has been\n\n    [years](https://public-inbox.org/git/xmqqeg4o27zw.fsf@gitster.mtv.corp.google.com/)\n\n    in the making, with code contributions from GitLab, GitHub, Microsoft and\n\n    Google. Today it is experimentally available in Git and GitLab, and can be\n\n    enabled by administrators\n\n    ([docs](https://docs.gitlab.com/topics/git/partial_clone/)).\n\n\n    Partial Clone speeds up fetching and cloning because less data is\n\n    transferred, and reduces disk usage on your local computer. For example,\n    cloning\n\n    [`gitlab-com/www-gitlab-com`](https://gitlab.com/gitlab-com/www-gitlab-com)\n\n    using Partial Clone (`--filter=blob:none`) is at least 50% faster, and\n    transfers\n\n    70% less data.\n\n\n    Note: Partial Clone is one specific performance optimization for very large\n\n    repositories. [Sparse\n\n    Checkout](https://github.blog/2020-01-17-bring-your-monorepo-down-to-size-with-sparse-checkout/)\n\n    is a related optimization that is particularly focused on repositories with\n\n    tremendously large numbers of files and revisions such as\n\n    [Windows](https://devblogs.microsoft.com/bharry/the-largest-git-repo-on-the-planet/)\n\n    code base.\n\n\n    ## A brief history of large files\n\n\n    \"What about Git LFS?\" you may ask. Doesn't LFS stand for \"large file\n    storage\"?\n\n\n    Previously, extra tools were required to store large files in Git. 
In 2010,\n\n    [git-annex](https://git-annex.branchable.com/) was released, and five years\n\n    later in 2015, [Git LFS](https://git-lfs.github.com/) was released. Both\n\n    git-annex and Git LFS added large file support to Git in a similar way:\n    Instead\n\n    of storing a large file in Git, store a pointer file that links to the large\n\n    file. Then, when someone needs a large file, they can download it on-demand\n\n    using the pointer.\n\n\n    The criticism of this approach is that there are now two places to store\n    files,\n\n    in Git or in Git LFS. Which means that everyone must remember that big files\n    need\n\n    to go in Git LFS to keep the Git repo small and fast. There are downsides to\n\n    this approach. Besides being susceptible to human error, the pointer encodes\n\n    decisions based on bandwidth and file type into the structure of the\n    repository\n\n    that influence all the people using the repository. Our assumptions about\n\n    bandwidth and storage are likely to change over time, and vary by the\n    location,\n\n    but decisions encoded in the repository are not flexible. Administrators and\n\n    developers alike benefit from flexibility in where to store large files, and\n\n    which files to download.\n\n\n    Partial Clone solves these problems by removing the need for two classes of\n\n    storage, and special pointers. Let's walk through an example to understand\n    how.\n\n\n    ## How to get started with Partial Clone\n\n\n    Let's continue to use `gitlab-com/www-gitlab-com` as an example project,\n    since\n\n    it has quite a lot of images. For a larger repository, like a video game\n    with\n\n    detailed textures and models that could take up a lot of disk space, the\n    benefits will be even more significant.\n\n\n    Instead of a vanilla `git clone`, we will include a filter spec which\n    controls\n\n    what is excluded when fetching data. 
In this situation, we just want to\n    exclude\n\n    large binary files. I've included `--no-checkout` so we can more clearly\n    observe\n\n    what is happening.\n\n\n    ```bash\n\n    git clone --filter=blob:none --no-checkout\n    git@gitlab.com:gitlab-com/www-gitlab-com.git\n\n    # Cloning into 'www-gitlab-com'...\n\n    # remote: Enumerating objects: 624541, done.\n\n    # remote: Counting objects: 100% (624541/624541), done.\n\n    # remote: Compressing objects: 100% (151886/151886), done.\n\n    # remote: Total 624541 (delta 432983), reused 622339 (delta 430843),\n    pack-reused 0\n\n    # Receiving objects: 100% (624541/624541), 74.61 MiB | 8.14 MiB/s, done.\n\n    # Resolving deltas: 100% (432983/432983), done.\n\n    ```\n\n\n    Above we explicitly told Git not to check out the default branch. Normally\n\n    `checkout` doesn't require fetching any data from the server, because we\n    have\n\n    everything locally. When using Partial Clone, since we are deliberately not\n    downloading everything, Git will need to fetch any missing files when doing\n    a\n\n    checkout.\n\n\n    ```bash\n\n    git checkout master\n\n    # remote: Enumerating objects: 12080, done.\n\n    # remote: Counting objects: 100% (12080/12080), done.\n\n    # remote: Compressing objects: 100% (11640/11640), done.\n\n    # remote: Total 12080 (delta 442), reused 9773 (delta 409), pack-reused 0\n\n    # Receiving objects: 100% (12080/12080), 1.10 GiB | 8.49 MiB/s, done.\n\n    # Resolving deltas: 100% (442/442), done.\n\n    # Updating files: 100% (12342/12342), done.\n\n    # Filtering content: 100% (3/3), 131.24 MiB | 4.73 MiB/s, done.\n\n    ```\n\n\n    If we checkout a different branch or commit, we'll need to download more\n    missing\n\n    files.\n\n\n    ```bash\n\n    git checkout 92d1f39b60f957d0bc3c5621bb3e17a3984bdf72\n\n    # remote: Enumerating objects: 1968, done.\n\n    # remote: Counting objects: 100% (1968/1968), done.\n\n    # remote: Compressing 
objects: 100% (1953/1953), done.\n\n    # remote: Total 1968 (delta 23), reused 1623 (delta 15), pack-reused 0\n\n    # Receiving objects: 100% (1968/1968), 327.44 MiB | 8.83 MiB/s, done.\n\n    # Resolving deltas: 100% (23/23), done.\n\n    # Updating files: 100% (2255/2255), done.\n\n    # Note: switching to '92d1f39b60f957d0bc3c5621bb3e17a3984bdf72'.\n\n    ```\n\n\n    Git remembers the filter spec we provided when cloning the repository so\n    that\n\n    fetching updates will also exclude large files until we need them.\n\n\n    ```bash\n\n    git config remote.origin.promisor\n\n    # true\n\n\n    git config remote.origin.partialclonefilter\n\n    # blob:none\n\n    ```\n\n\n    When committing changes, you simply commit binary files like you would any\n    other\n\n    file. There is no extra tool to install or configure, no need to treat big\n    files\n\n    differently to small files.\n\n\n    ## Network and Storage\n\n\n    If you are already using [Git LFS](https://git-lfs.github.com/) today, you\n    might\n\n    be aware that large files are stored and transferred differently to regular\n    Git\n\n    objects. On GitLab.com, Git LFS objects are stored in object storage (like\n    AWS\n\n    S3) rather than fast attached storage (like SSD), and transferred over HTTP\n    even\n\n    when using SSH for regular Git objects. Using object storage has the\n    advantage\n\n    of reducing storage costs for large binary files, while using simpler HTTP\n\n    requests for large downloads allows the possibility of resumable and\n    parallel\n\n    downloads.\n\n\n    Partial Clone\n\n    [already](https://public-inbox.org/git/20190625134039.21707-1-chriscool@tuxfamily.org/)\n\n    supports more than one remote, and work is underway to allow large files to\n    be\n\n    stored in a different location such as object storage. 
Unlike Git LFS,\n    however,\n\n    the repository or instance administrator will be able to choose which\n    objects\n\n    should be stored where, and change this configuration over time if needed.\n\n\n    Follow the epic for [improved large file\n\n    storage](https://gitlab.com/groups/gitlab-org/-/epics/1487) to learn more\n    and\n\n    follow our progress.\n\n\n    ## Performance\n\n\n    When fetching new objects from the Git server using a [filter\n\n    spec](https://github.com/git/git/blob/v2.25.0/Documentation/rev-list-options.txt#L735)\n    to exclude objects from the response, Git will check each object and exclude\n    any that match the filter spec. In [Git\n    2.25](https://raw.githubusercontent.com/git/git/master/Documentation/RelNotes/2.25.0.txt),\n    the most recent version, filtering has not been optimized for performance.\n\n    [Jeff King (Peff)](https://github.com/peff/) (GitHub) recently\n\n    [contributed](https://public-inbox.org/git/20200214182147.GA654525@coredump.intra.peff.net/)\n\n    performance improvements for blob size filtering, which will likely be\n    included\n\n    in [Git 2.26](https://gitlab.com/gitlab-org/gitaly/issues/2497), and our\n    plan is\n\n    to include it in the GitLab 12.10 release.\n\n\n    Optimizing the sparse filter spec option (`--filter=sparse:oid`), which filters\n\n    based on file path, is more complex because blobs, which contain the file\n\n    content, do not include file path information. The directory structure of a\n\n    repository is stored in tree objects.\n\n\n    Follow the epic for [Partial Clone performance\n\n    improvements](https://gitlab.com/groups/gitlab-org/-/epics/1671) to learn\n    more\n\n    and follow our progress.\n\n\n    ## Usability\n\n\n    One of the drawbacks of Git LFS was that it required installing an\n    additional\n\n    tool. 
In comparison, Partial Clone does not require any additional tools.\n\n    However, it does require learning new options and configurations, such as how to\n\n    clone using the `--filter` option.\n\n\n    We want to make it easy for people to get their work done, people who simply\n    want Git to\n\n    just work. They shouldn't need to work out the optimal blob size\n    filter\n\n    spec for a project, or even know what a filter spec is. While Partial Clone\n    remains\n\n    experimental, we haven't made any changes to the GitLab interface to\n    highlight\n\n    Partial Clone, but we are investigating this and welcome your feedback.\n    Please\n\n    join the conversation on this\n\n    [issue](https://gitlab.com/gitlab-org/gitlab/issues/207744).\n\n\n    ## File locking and tool integrations\n\n\n    Any conversation about large binary files, particularly in regard to video\n\n    games, is incomplete without discussing file locking and tooling\n    integrations.\n\n\n    Unlike plain text [source code](/solutions/source-code-management/),\n    resolving conflicts between different versions of\n\n    a binary file is often impossible. To prevent conflicts in binary file\n    editing,\n\n    an exclusive file lock is used, meaning only one person at a time can edit a\n\n    single file, regardless of branches. If conflicts can't be resolved,\n    allowing multiple\n\n    versions of an individual file to be created in parallel on different\n    branches is a bug, not\n\n    a feature. GitLab already has basic file locking support, but it is really\n    only\n\n    useful for plain text because it only applies to the default branch, and is\n    not\n\n    integrated with any local tools.\n\n\n    Local tooling integrations are important for binary asset workflows, to\n\n    automatically propagate file locks to the local development environment, and\n    to\n\n    allow artists to work on assets without needing to use Git from the command\n\n    line. 
Propagating file locks quickly to local development environments is\n    also\n\n    important because it prevents work from being wasted before it even happens.\n\n\n    Follow the [file locking](https://gitlab.com/groups/gitlab-org/-/epics/1488)\n    and\n\n    [integrations](https://gitlab.com/groups/gitlab-org/-/epics/2704) epics for\n    more\n\n    information about what we're working on.\n\n\n    ## Conclusion\n\n\n    Large files are necessary for many projects, and Git will soon support this\n\n    natively, without the need for extra tools. Although Partial Clone is still\n    an\n\n    experimental feature, we are making improvements with every release and the\n\n    feature is now ready for testing.\n\n\n    Thank you to the Git community for your work over the past years on\n    improving\n\n    support for enormous repositories. Particularly, thank you to [Jeff\n\n    King](https://github.com/peff/) (GitHub) and [Christian\n\n    Couder](https://about.gitlab.com/company/team/#chriscool) (senior backend\n\n    engineer on Gitaly at GitLab) for your early experimentation with Partial\n    Clone,\n\n    Jonathan Tan (Google) and [Jeff Hostetler](https://github.com/jeffhostetler)\n\n    (Microsoft) for contributing the [first\n\n    implementation](https://public-inbox.org/git/cover.1506714999.git.jonathantanmy@google.com/)\n\n    of Partial Clone and promisor remotes, and the many others who've also\n\n    contributed.\n\n\n    If you are already using Partial Clone, or would like to help us test\n    Partial\n\n    Clone on a large project, please get in touch with me, [James\n\n    Ramsay](https://about.gitlab.com/company/team/#jramsay) (group manager,\n    product\n\n    for Create at GitLab), [Jordi\n\n    Mon](https://about.gitlab.com/company/team/#jordi_mon) (senior product\n    marketing\n\n    manager for Dev at GitLab), or your account manager.\n\n\n    For more information on Partial Clone, check out [the\n    
documentation](https://docs.gitlab.com/topics/git/partial_clone/).\n\n\n    Cover image by [Simon Boxus](https://unsplash.com/@simonlerouge) on\n\n    [Unsplash](https://unsplash.com/photos/4ftI4lCcByM)\n\n  category: open-source\n  tags:\n    - git\n    - performance\nconfig:\n  slug: partial-clone-for-massive-repositories\n  featured: false\n  template: BlogPost\n",{"title":5,"description":17,"ogTitle":5,"ogDescription":17,"noIndex":14,"ogImage":19,"ogUrl":31,"ogSiteName":32,"ogType":33,"canonicalUrls":31},"https://about.gitlab.com/blog/partial-clone-for-massive-repositories","https://about.gitlab.com","article","en-us/blog/partial-clone-for-massive-repositories",[22,23],[22,23],"7lnfOe6yMVXXzYeJ44iWr81KY-dWm4wSG3k5m11gNWk",{"data":39},{"logo":40,"freeTrial":45,"sales":50,"login":55,"items":60,"search":369,"minimal":400,"duo":419,"switchNav":428,"pricingDeployment":439},{"config":41},{"href":42,"dataGaName":43,"dataGaLocation":44},"/","gitlab logo","header",{"text":46,"config":47},"Get free trial",{"href":48,"dataGaName":49,"dataGaLocation":44},"https://gitlab.com/-/trial_registrations/new?glm_source=about.gitlab.com&glm_content=default-saas-trial/","free trial",{"text":51,"config":52},"Talk to sales",{"href":53,"dataGaName":54,"dataGaLocation":44},"/sales/","sales",{"text":56,"config":57},"Sign in",{"href":58,"dataGaName":59,"dataGaLocation":44},"https://gitlab.com/users/sign_in/","sign in",[61,88,183,188,290,350],{"text":62,"config":63,"cards":65},"Platform",{"dataNavLevelOne":64},"platform",[66,72,80],{"title":62,"description":67,"link":68},"The intelligent orchestration platform for DevSecOps",{"text":69,"config":70},"Explore our Platform",{"href":71,"dataGaName":64,"dataGaLocation":44},"/platform/",{"title":73,"description":74,"link":75},"GitLab Duo Agent Platform","Agentic AI for the entire software lifecycle",{"text":76,"config":77},"Meet GitLab Duo",{"href":78,"dataGaName":79,"dataGaLocation":44},"/gitlab-duo-agent-platform/","gitlab duo agent 
platform",{"title":81,"description":82,"link":83},"Why GitLab","See the top reasons enterprises choose GitLab",{"text":84,"config":85},"Learn more",{"href":86,"dataGaName":87,"dataGaLocation":44},"/why-gitlab/","why gitlab",{"text":89,"left":27,"config":90,"link":92,"lists":96,"footer":165},"Product",{"dataNavLevelOne":91},"solutions",{"text":93,"config":94},"View all Solutions",{"href":95,"dataGaName":91,"dataGaLocation":44},"/solutions/",[97,121,144],{"title":98,"description":99,"link":100,"items":105},"Automation","CI/CD and automation to accelerate deployment",{"config":101},{"icon":102,"href":103,"dataGaName":104,"dataGaLocation":44},"AutomatedCodeAlt","/solutions/delivery-automation/","automated software delivery",[106,110,113,117],{"text":107,"config":108},"CI/CD",{"href":109,"dataGaLocation":44,"dataGaName":107},"/solutions/continuous-integration/",{"text":73,"config":111},{"href":78,"dataGaLocation":44,"dataGaName":112},"gitlab duo agent platform - product menu",{"text":114,"config":115},"Source Code Management",{"href":116,"dataGaLocation":44,"dataGaName":114},"/solutions/source-code-management/",{"text":118,"config":119},"Automated Software Delivery",{"href":103,"dataGaLocation":44,"dataGaName":120},"Automated software delivery",{"title":122,"description":123,"link":124,"items":129},"Security","Deliver code faster without compromising security",{"config":125},{"href":126,"dataGaName":127,"dataGaLocation":44,"icon":128},"/solutions/application-security-testing/","security and compliance","ShieldCheckLight",[130,134,139],{"text":131,"config":132},"Application Security Testing",{"href":126,"dataGaName":133,"dataGaLocation":44},"Application security testing",{"text":135,"config":136},"Software Supply Chain Security",{"href":137,"dataGaLocation":44,"dataGaName":138},"/solutions/supply-chain/","Software supply chain security",{"text":140,"config":141},"Software 
Compliance",{"href":142,"dataGaName":143,"dataGaLocation":44},"/solutions/software-compliance/","software compliance",{"title":145,"link":146,"items":151},"Measurement",{"config":147},{"icon":148,"href":149,"dataGaName":150,"dataGaLocation":44},"DigitalTransformation","/solutions/visibility-measurement/","visibility and measurement",[152,156,160],{"text":153,"config":154},"Visibility & Measurement",{"href":149,"dataGaLocation":44,"dataGaName":155},"Visibility and Measurement",{"text":157,"config":158},"Value Stream Management",{"href":159,"dataGaLocation":44,"dataGaName":157},"/solutions/value-stream-management/",{"text":161,"config":162},"Analytics & Insights",{"href":163,"dataGaLocation":44,"dataGaName":164},"/solutions/analytics-and-insights/","Analytics and insights",{"title":166,"items":167},"GitLab for",[168,173,178],{"text":169,"config":170},"Enterprise",{"href":171,"dataGaLocation":44,"dataGaName":172},"/enterprise/","enterprise",{"text":174,"config":175},"Small Business",{"href":176,"dataGaLocation":44,"dataGaName":177},"/small-business/","small business",{"text":179,"config":180},"Public Sector",{"href":181,"dataGaLocation":44,"dataGaName":182},"/solutions/public-sector/","public sector",{"text":184,"config":185},"Pricing",{"href":186,"dataGaName":187,"dataGaLocation":44,"dataNavLevelOne":187},"/pricing/","pricing",{"text":189,"config":190,"link":192,"lists":196,"feature":281},"Resources",{"dataNavLevelOne":191},"resources",{"text":193,"config":194},"View all resources",{"href":195,"dataGaName":191,"dataGaLocation":44},"/resources/",[197,230,253],{"title":198,"items":199},"Getting started",[200,205,210,215,220,225],{"text":201,"config":202},"Install",{"href":203,"dataGaName":204,"dataGaLocation":44},"/install/","install",{"text":206,"config":207},"Quick start guides",{"href":208,"dataGaName":209,"dataGaLocation":44},"/get-started/","quick setup 
checklists",{"text":211,"config":212},"Learn",{"href":213,"dataGaLocation":44,"dataGaName":214},"https://university.gitlab.com/","learn",{"text":216,"config":217},"Product documentation",{"href":218,"dataGaName":219,"dataGaLocation":44},"https://docs.gitlab.com/","product documentation",{"text":221,"config":222},"Best practice videos",{"href":223,"dataGaName":224,"dataGaLocation":44},"/getting-started-videos/","best practice videos",{"text":226,"config":227},"Integrations",{"href":228,"dataGaName":229,"dataGaLocation":44},"/integrations/","integrations",{"title":231,"items":232},"Discover",[233,238,243,248],{"text":234,"config":235},"Customer success stories",{"href":236,"dataGaName":237,"dataGaLocation":44},"/customers/","customer success stories",{"text":239,"config":240},"Blog",{"href":241,"dataGaName":242,"dataGaLocation":44},"/blog/","blog",{"text":244,"config":245},"The Source",{"href":246,"dataGaName":247,"dataGaLocation":44},"/the-source/","the source",{"text":249,"config":250},"Remote",{"href":251,"dataGaName":252,"dataGaLocation":44},"https://handbook.gitlab.com/handbook/company/culture/all-remote/","remote",{"title":254,"items":255},"Connect",[256,261,266,271,276],{"text":257,"config":258},"GitLab Services",{"href":259,"dataGaName":260,"dataGaLocation":44},"/services/","services",{"text":262,"config":263},"Community",{"href":264,"dataGaName":265,"dataGaLocation":44},"/community/","community",{"text":267,"config":268},"Forum",{"href":269,"dataGaName":270,"dataGaLocation":44},"https://forum.gitlab.com/","forum",{"text":272,"config":273},"Events",{"href":274,"dataGaName":275,"dataGaLocation":44},"/events/","events",{"text":277,"config":278},"Partners",{"href":279,"dataGaName":280,"dataGaLocation":44},"/partners/","partners",{"textColor":282,"title":283,"text":284,"link":285},"#000","What’s new in GitLab","Stay updated with our latest features and improvements.",{"text":286,"config":287},"Read the 
latest",{"href":288,"dataGaName":289,"dataGaLocation":44},"/releases/whats-new/","whats new",{"text":291,"config":292,"lists":294},"Company",{"dataNavLevelOne":293},"company",[295],{"items":296},[297,302,308,310,315,320,325,330,335,340,345],{"text":298,"config":299},"About",{"href":300,"dataGaName":301,"dataGaLocation":44},"/company/","about",{"text":303,"config":304,"footerGa":307},"Jobs",{"href":305,"dataGaName":306,"dataGaLocation":44},"/jobs/","jobs",{"dataGaName":306},{"text":272,"config":309},{"href":274,"dataGaName":275,"dataGaLocation":44},{"text":311,"config":312},"Leadership",{"href":313,"dataGaName":314,"dataGaLocation":44},"/company/team/e-group/","leadership",{"text":316,"config":317},"Team",{"href":318,"dataGaName":319,"dataGaLocation":44},"/company/team/","team",{"text":321,"config":322},"Handbook",{"href":323,"dataGaName":324,"dataGaLocation":44},"https://handbook.gitlab.com/","handbook",{"text":326,"config":327},"Investor relations",{"href":328,"dataGaName":329,"dataGaLocation":44},"https://ir.gitlab.com/","investor relations",{"text":331,"config":332},"Trust Center",{"href":333,"dataGaName":334,"dataGaLocation":44},"/security/","trust center",{"text":336,"config":337},"AI Transparency Center",{"href":338,"dataGaName":339,"dataGaLocation":44},"/ai-transparency-center/","ai transparency center",{"text":341,"config":342},"Newsletter",{"href":343,"dataGaName":344,"dataGaLocation":44},"/company/contact/#contact-forms","newsletter",{"text":346,"config":347},"Press",{"href":348,"dataGaName":349,"dataGaLocation":44},"/press/","press",{"text":351,"config":352,"lists":353},"Contact us",{"dataNavLevelOne":293},[354],{"items":355},[356,359,364],{"text":51,"config":357},{"href":53,"dataGaName":358,"dataGaLocation":44},"talk to sales",{"text":360,"config":361},"Support portal",{"href":362,"dataGaName":363,"dataGaLocation":44},"https://support.gitlab.com","support portal",{"text":365,"config":366},"Customer 
portal",{"href":367,"dataGaName":368,"dataGaLocation":44},"https://customers.gitlab.com/customers/sign_in/","customer portal",{"close":370,"login":371,"suggestions":378},"Close",{"text":372,"link":373},"To search repositories and projects, login to",{"text":374,"config":375},"gitlab.com",{"href":58,"dataGaName":376,"dataGaLocation":377},"search login","search",{"text":379,"default":380},"Suggestions",[381,383,387,389,393,397],{"text":73,"config":382},{"href":78,"dataGaName":73,"dataGaLocation":377},{"text":384,"config":385},"Code Suggestions (AI)",{"href":386,"dataGaName":384,"dataGaLocation":377},"/solutions/code-suggestions/",{"text":107,"config":388},{"href":109,"dataGaName":107,"dataGaLocation":377},{"text":390,"config":391},"GitLab on AWS",{"href":392,"dataGaName":390,"dataGaLocation":377},"/partners/technology-partners/aws/",{"text":394,"config":395},"GitLab on Google Cloud",{"href":396,"dataGaName":394,"dataGaLocation":377},"/partners/technology-partners/google-cloud-platform/",{"text":398,"config":399},"Why GitLab?",{"href":86,"dataGaName":398,"dataGaLocation":377},{"freeTrial":401,"mobileIcon":406,"desktopIcon":411,"secondaryButton":414},{"text":402,"config":403},"Start free trial",{"href":404,"dataGaName":49,"dataGaLocation":405},"https://gitlab.com/-/trials/new/","nav",{"altText":407,"config":408},"Gitlab Icon",{"src":409,"dataGaName":410,"dataGaLocation":405},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1758203874/jypbw1jx72aexsoohd7x.svg","gitlab icon",{"altText":407,"config":412},{"src":413,"dataGaName":410,"dataGaLocation":405},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1758203875/gs4c8p8opsgvflgkswz9.svg",{"text":415,"config":416},"Get Started",{"href":417,"dataGaName":418,"dataGaLocation":405},"https://gitlab.com/-/trial_registrations/new?glm_source=about.gitlab.com/get-started/","get started",{"freeTrial":420,"mobileIcon":424,"desktopIcon":426},{"text":421,"config":422},"Learn more about GitLab 
Duo",{"href":78,"dataGaName":423,"dataGaLocation":405},"gitlab duo",{"altText":407,"config":425},{"src":409,"dataGaName":410,"dataGaLocation":405},{"altText":407,"config":427},{"src":413,"dataGaName":410,"dataGaLocation":405},{"button":429,"mobileIcon":434,"desktopIcon":436},{"text":430,"config":431},"/switch",{"href":432,"dataGaName":433,"dataGaLocation":405},"#contact","switch",{"altText":407,"config":435},{"src":409,"dataGaName":410,"dataGaLocation":405},{"altText":407,"config":437},{"src":438,"dataGaName":410,"dataGaLocation":405},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1773335277/ohhpiuoxoldryzrnhfrh.png",{"freeTrial":440,"mobileIcon":445,"desktopIcon":447},{"text":441,"config":442},"Back to pricing",{"href":186,"dataGaName":443,"dataGaLocation":405,"icon":444},"back to pricing","GoBack",{"altText":407,"config":446},{"src":409,"dataGaName":410,"dataGaLocation":405},{"altText":407,"config":448},{"src":413,"dataGaName":410,"dataGaLocation":405},{"title":450,"button":451,"config":456},"See how agentic AI transforms software delivery",{"text":452,"config":453},"Watch GitLab Transcend now",{"href":454,"dataGaName":455,"dataGaLocation":44},"/events/transcend/virtual/","transcend event",{"layout":457,"icon":458,"disabled":27},"release","AiStar",{"data":460},{"text":461,"source":462,"edit":468,"contribute":473,"config":478,"items":483,"minimal":690},"Git is a trademark of Software Freedom Conservancy and our use of 'GitLab' is under license",{"text":463,"config":464},"View page source",{"href":465,"dataGaName":466,"dataGaLocation":467},"https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/","page source","footer",{"text":469,"config":470},"Edit this page",{"href":471,"dataGaName":472,"dataGaLocation":467},"https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/blob/main/content/","web ide",{"text":474,"config":475},"Please 
contribute",{"href":476,"dataGaName":477,"dataGaLocation":467},"https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/blob/main/CONTRIBUTING.md/","please contribute",{"twitter":479,"facebook":480,"youtube":481,"linkedin":482},"https://twitter.com/gitlab","https://www.facebook.com/gitlab","https://www.youtube.com/channel/UCnMGQ8QHMAnVIsI3xJrihhg","https://www.linkedin.com/company/gitlab-com",[484,531,585,629,656],{"title":184,"links":485,"subMenu":500},[486,490,495],{"text":487,"config":488},"View plans",{"href":186,"dataGaName":489,"dataGaLocation":467},"view plans",{"text":491,"config":492},"Why Premium?",{"href":493,"dataGaName":494,"dataGaLocation":467},"/pricing/premium/","why premium",{"text":496,"config":497},"Why Ultimate?",{"href":498,"dataGaName":499,"dataGaLocation":467},"/pricing/ultimate/","why ultimate",[501],{"title":502,"links":503},"Contact Us",[504,507,509,511,516,521,526],{"text":505,"config":506},"Contact sales",{"href":53,"dataGaName":54,"dataGaLocation":467},{"text":360,"config":508},{"href":362,"dataGaName":363,"dataGaLocation":467},{"text":365,"config":510},{"href":367,"dataGaName":368,"dataGaLocation":467},{"text":512,"config":513},"Status",{"href":514,"dataGaName":515,"dataGaLocation":467},"https://status.gitlab.com/","status",{"text":517,"config":518},"Terms of use",{"href":519,"dataGaName":520,"dataGaLocation":467},"/terms/","terms of use",{"text":522,"config":523},"Privacy statement",{"href":524,"dataGaName":525,"dataGaLocation":467},"/privacy/","privacy statement",{"text":527,"config":528},"Cookie preferences",{"dataGaName":529,"dataGaLocation":467,"id":530,"isOneTrustButton":27},"cookie preferences","ot-sdk-btn",{"title":89,"links":532,"subMenu":541},[533,537],{"text":534,"config":535},"DevSecOps platform",{"href":71,"dataGaName":536,"dataGaLocation":467},"devsecops platform",{"text":538,"config":539},"AI-Assisted Development",{"href":78,"dataGaName":540,"dataGaLocation":467},"ai-assisted 
development",[542],{"title":543,"links":544},"Topics",[545,550,555,560,565,570,575,580],{"text":546,"config":547},"CICD",{"href":548,"dataGaName":549,"dataGaLocation":467},"/topics/ci-cd/","cicd",{"text":551,"config":552},"GitOps",{"href":553,"dataGaName":554,"dataGaLocation":467},"/topics/gitops/","gitops",{"text":556,"config":557},"DevOps",{"href":558,"dataGaName":559,"dataGaLocation":467},"/topics/devops/","devops",{"text":561,"config":562},"Version Control",{"href":563,"dataGaName":564,"dataGaLocation":467},"/topics/version-control/","version control",{"text":566,"config":567},"DevSecOps",{"href":568,"dataGaName":569,"dataGaLocation":467},"/topics/devsecops/","devsecops",{"text":571,"config":572},"Cloud Native",{"href":573,"dataGaName":574,"dataGaLocation":467},"/topics/cloud-native/","cloud native",{"text":576,"config":577},"AI for Coding",{"href":578,"dataGaName":579,"dataGaLocation":467},"/topics/devops/ai-for-coding/","ai for coding",{"text":581,"config":582},"Agentic AI",{"href":583,"dataGaName":584,"dataGaLocation":467},"/topics/agentic-ai/","agentic ai",{"title":586,"links":587},"Solutions",[588,590,592,597,601,604,608,611,613,616,619,624],{"text":131,"config":589},{"href":126,"dataGaName":131,"dataGaLocation":467},{"text":120,"config":591},{"href":103,"dataGaName":104,"dataGaLocation":467},{"text":593,"config":594},"Agile development",{"href":595,"dataGaName":596,"dataGaLocation":467},"/solutions/agile-delivery/","agile delivery",{"text":598,"config":599},"SCM",{"href":116,"dataGaName":600,"dataGaLocation":467},"source code management",{"text":546,"config":602},{"href":109,"dataGaName":603,"dataGaLocation":467},"continuous integration & delivery",{"text":605,"config":606},"Value stream management",{"href":159,"dataGaName":607,"dataGaLocation":467},"value stream 
management",{"text":551,"config":609},{"href":610,"dataGaName":554,"dataGaLocation":467},"/solutions/gitops/",{"text":169,"config":612},{"href":171,"dataGaName":172,"dataGaLocation":467},{"text":614,"config":615},"Small business",{"href":176,"dataGaName":177,"dataGaLocation":467},{"text":617,"config":618},"Public sector",{"href":181,"dataGaName":182,"dataGaLocation":467},{"text":620,"config":621},"Education",{"href":622,"dataGaName":623,"dataGaLocation":467},"/solutions/education/","education",{"text":625,"config":626},"Financial services",{"href":627,"dataGaName":628,"dataGaLocation":467},"/solutions/finance/","financial services",{"title":189,"links":630},[631,633,635,637,640,642,644,646,648,650,652,654],{"text":201,"config":632},{"href":203,"dataGaName":204,"dataGaLocation":467},{"text":206,"config":634},{"href":208,"dataGaName":209,"dataGaLocation":467},{"text":211,"config":636},{"href":213,"dataGaName":214,"dataGaLocation":467},{"text":216,"config":638},{"href":218,"dataGaName":639,"dataGaLocation":467},"docs",{"text":239,"config":641},{"href":241,"dataGaName":242,"dataGaLocation":467},{"text":234,"config":643},{"href":236,"dataGaName":237,"dataGaLocation":467},{"text":249,"config":645},{"href":251,"dataGaName":252,"dataGaLocation":467},{"text":257,"config":647},{"href":259,"dataGaName":260,"dataGaLocation":467},{"text":262,"config":649},{"href":264,"dataGaName":265,"dataGaLocation":467},{"text":267,"config":651},{"href":269,"dataGaName":270,"dataGaLocation":467},{"text":272,"config":653},{"href":274,"dataGaName":275,"dataGaLocation":467},{"text":277,"config":655},{"href":279,"dataGaName":280,"dataGaLocation":467},{"title":291,"links":657},[658,660,662,664,666,668,670,674,679,681,683,685],{"text":298,"config":659},{"href":300,"dataGaName":293,"dataGaLocation":467},{"text":303,"config":661},{"href":305,"dataGaName":306,"dataGaLocation":467},{"text":311,"config":663},{"href":313,"dataGaName":314,"dataGaLocation":467},{"text":316,"config":665},{"href":318,"dataGaN
ame":319,"dataGaLocation":467},{"text":321,"config":667},{"href":323,"dataGaName":324,"dataGaLocation":467},{"text":326,"config":669},{"href":328,"dataGaName":329,"dataGaLocation":467},{"text":671,"config":672},"Sustainability",{"href":673,"dataGaName":671,"dataGaLocation":467},"/sustainability/",{"text":675,"config":676},"Diversity, inclusion and belonging (DIB)",{"href":677,"dataGaName":678,"dataGaLocation":467},"/diversity-inclusion-belonging/","Diversity, inclusion and belonging",{"text":331,"config":680},{"href":333,"dataGaName":334,"dataGaLocation":467},{"text":341,"config":682},{"href":343,"dataGaName":344,"dataGaLocation":467},{"text":346,"config":684},{"href":348,"dataGaName":349,"dataGaLocation":467},{"text":686,"config":687},"Modern Slavery Transparency Statement",{"href":688,"dataGaName":689,"dataGaLocation":467},"https://handbook.gitlab.com/handbook/legal/modern-slavery-act-transparency-statement/","modern slavery transparency statement",{"items":691},[692,695,698],{"text":693,"config":694},"Terms",{"href":519,"dataGaName":520,"dataGaLocation":467},{"text":696,"config":697},"Cookies",{"dataGaName":529,"dataGaLocation":467,"id":530,"isOneTrustButton":27},{"text":699,"config":700},"Privacy",{"href":524,"dataGaName":525,"dataGaLocation":467},[702],{"id":703,"title":9,"body":25,"config":704,"content":706,"description":25,"extension":24,"meta":710,"navigation":27,"path":711,"seo":712,"stem":713,"__hash__":714},"blogAuthors/en-us/blog/authors/james-ramsay.yml",{"template":705},"BlogAuthor",{"name":9,"config":707},{"headshot":708,"ctfId":709},"","jramsay",{},"/en-us/blog/authors/james-ramsay",{},"en-us/blog/authors/james-ramsay","iKU7kqEnGoklxsvYnqw2L7oAOICkVwJOVHAkOv_GdM4",[716,729,742],{"content":717,"config":727},{"title":718,"description":719,"authors":720,"heroImage":722,"date":723,"body":724,"category":11,"tags":725},"GitLab AI Hackathon 2026: Meet the winners","Nearly 7,000 developers built 600+ AI agents and flows on GitLab Duo Agent Platform. 
Find out who won and what they created.",[721],"Nick Veenhof","https://res.cloudinary.com/about-gitlab-com/image/upload/v1776457632/llddiylsgwuze0u1rjks.png","2026-04-22","AI writes code. That is expected now. But planning, security, compliance, and deployments? Those gaps remain. I have run contributor programs for years. I have never seen a community respond to technology like this.\n\nThat is why we opened [GitLab Duo Agent Platform](https://about.gitlab.com/gitlab-duo-agent-platform/) and invited developers worldwide to build AI agents that help teams ship secure software faster. Not chatbots that answer questions, but agents that jump into workflows, respond to events, and act on your behalf. The GitLab AI Hackathon ran from February 9 to March 25, 2026, on Devpost, the hackathon platform. Google Cloud and Anthropic joined as co-sponsors.\n\nWhen my team planned this hackathon with Google Cloud and Anthropic, I asked the judges to score four things: technical work, design, potential impact, and idea quality. We hoped for strong turnout. What we got surprised all of us. Nineteen judges spent 18 days reviewing every entry. Google Cloud and Anthropic provided judges, prizes, and cloud access. The community built hundreds of agents and flows because they wanted to solve these problems.\n\nNearly 7,000 developers showed up. They built 600+ agents and flows in weeks. The prizes across all categories totaled $65,000 from GitLab, Google Cloud, and Anthropic.\n\n\nIf you have ever watched a senior engineer leave and take half the team's knowledge with them, you know why the winning project hit so hard.\n\nRead on to find out what the community built.\n\n## Grand Prize: LORE\n\n[LORE](https://devpost.com/software/lore-living-organizational-record-engine), the Living Organizational Record Engine, uses eight agents with a router that sends each question to the right agent, logic to prevent circular loops in the knowledge graph, a visual dashboard, and carbon tracking. 
The command-line tool ships with 43 tests (yes, 43 tests in a hackathon project).\n\nLORE solves a real problem: the knowledge that lives in engineers' heads and walks out the door when they leave. In my experience, that level of testing is rare in a hackathon, and it tells you something about the team behind it.\n\nJudge April Guo (Anthropic) wrote: \"This feels like a product, not a hackathon project.\"\n\n### Google Cloud winners\n\n[Gitdefender](https://devpost.com/software/gitdefender) won the Google Cloud Grand Prize. It works inside code review workflows, finding and fixing security issues. It spots the bug, writes the fix, and opens the code review. No developer needs to step in.\n\n[Aegis](https://devpost.com/software/aegis-2m1oq0) won the Google Cloud Runner Up prize. It gives AI-powered explanations for every decision it makes, deployed to Google Cloud and ready for production use.\n\n### Anthropic winners\n\n[GraphDev](https://devpost.com/software/graphdev) won the Anthropic Grand Prize. It maps code links and shows how systems change over time. Judge Aboobacker MK (GitLab) noted it was \"in sync with our work on GitLab knowledge graph.\" Judge Ayush Billore (GitLab) wrote: \"Loved the demo and UX, super useful for understanding how the system evolved and what gets impacted by changes.\" You can see the full impact of a change before you make it.\n\n[DocSync](https://devpost.com/software/pipeheal) won the Anthropic Runner Up prize. It uses three agents: Detector, Writer, and Reviewer. If DocSync is confident in the fix, it opens a code review. If not, it creates an issue for a human to check.\n\n## Category winners\n\n### Most Technically Impressive\n\nDatabase migrations break things. [Time-Traveler](https://devpost.com/software/time-traveler-w3cxp0) creates a safe copy of your production setup, runs the migration against that copy, and reports the result. 
It runs five agents connected by a bridge, with real Google Cloud deployment, real PostgreSQL migrations, and real data.\n\n### Most Impactful\n\n[RedAgent](https://devpost.com/software/redagent) checks AI-generated security reports, closing the trust gap between AI findings and developer action. If your team uses AI for security scanning, you know this problem. I have seen teams dismiss AI findings because they could not verify them. RedAgent gives teams a way to check AI output before it reaches developers.\n\n### Easiest to Use\n\n[Launch Control](https://devpost.com/software/launch-control-bgp8az) delivers polished UX and solid infrastructure, and scored well on sustainability too.\n\n## The sustainability signal\n\nFive projects won prizes or bonuses for environmental impact. Software delivery has always had a carbon cost from CI/CD pipelines, and now LLMs run compute at scale too. We created the Green Agent category to challenge developers to measure and reduce that footprint. Stacy Cline and Kim Buncle from GitLab's sustainability team helped judge the category.\n\n### Green Agent prize\n\n[GreenPipe](https://devpost.com/software/greenpipe) scans CI/CD pipelines for environmental impact and produces carbon footprint reports. Judges Kim Buncle and Rajesh Agadi (Google) both backed the project.\n\n### Sustainable Design bonus\n\nSustainable Design bonuses were awarded to projects with exceptional sustainability practices in their design, from model optimization techniques to energy-efficient architecture choices.\n\n* [BugFlow](https://devpost.com/software/bugflow-ai-regression-detective-ci-optimizer) turned one bug report into 10 fixes in 20 minutes.\n* [DELTA Cyber Reasoning](https://devpost.com/software/delta-cyber-reasoning-system) is automated fuzz testing for security. 
\n* [CarbonLint](https://devpost.com/software/carbonlint) applied code analysis to energy use.\n* [TFGuardian](https://devpost.com/software/tfguardian) features a carbon footprint analyzer, among other agents.\n\nCongratulations to all the Sustainable Design bonus winners!\n\nJudge Jens-Joris Decorte (TechWolf) cited the result: costs dropped from $556 to $18 per month, a 96% carbon cut (that is a $538 monthly saving with a sustainability label on it).\n\n## Honorable mentions and the long tail\n\nSix projects received honorable mentions:\n\n- [SecurityMonkey](https://devpost.com/software/securitymonkey) injects known vulnerabilities into a test branch and scores how well your security scanners catch them.\n- [stregent](https://devpost.com/software/stregent) monitors CI/CD pipelines and lets developers investigate and merge fixes from WhatsApp without opening a laptop.\n- [Compliance Sentinel](https://devpost.com/software/compliance-sentinel-autonomous-devsecops-governance) scores every merge request for compliance risk and blocks the merge if critical violations are detected.\n- [Carbon Tracker](https://devpost.com/software/carbon-tracker-ij25kf) calculates the carbon footprint of each CI/CD pipeline job and posts optimization tips on the merge request.\n- [RepoWarden](https://devpost.com/software/docuguard) is the first Living Specification Engine, an AI system that captures why code was written, not just what it does.\n- [MR Compliance Auditor](https://devpost.com/software/mr-compliance-auditor) collects evidence across merge requests, maps it to SOC 2 controls, and streams compliance scores to a live dashboard.\n\nMy favorite quote from the judging came from Luca Chun Lun Lit (Anthropic), who described stregent's mobile-first approach: \"Being able to essentially code from your phone is a next level in the engineering experience.\"\n\n> Explore the 600+ entries in the [project gallery](https://gitlab.devpost.com/project-gallery).\n\n## What comes 
next\n\nEvery agent in this hackathon worked within a single project. They still delivered impressive results. Some participants ran a local knowledge graph alongside their agents to surface code relationships and dependencies within the repo. LORE captures project history. Gitdefender finds vulnerabilities. Pairing agents with richer local context is already helping contributors build sharper tools. The next hackathon will build on what contributors are already doing with richer context. Sign up on [contributors.gitlab.com](https://contributors.gitlab.com/) to be the first to know when details drop.\n\n\n## Get started\n\nA special thanks to Lee Tickett (GitLab) and Mattias Michaux (GitLab) for orchestrating the orchestrators and innovators behind this hackathon!\n\nThank you to every developer who submitted. Nearly 7,000 of you showed what GitLab Duo Agent Platform can do when a community decides to build. I am proud of what you built here, and I cannot wait to see what you build next.\n\nBuild your own agent on [GitLab Duo Agent Platform](https://docs.gitlab.com/user/duo_agent_platform/). Browse community-built agents in the [AI Catalog](https://docs.gitlab.com/user/duo_agent_platform/ai_catalog/). You orchestrate. AI accelerates.\n",[726,265],"AI/ML",{"featured":14,"template":15,"slug":728},"gitlab-ai-hackathon-2026-meet-the-winners",{"content":730,"config":740},{"title":731,"description":732,"authors":733,"heroImage":735,"date":736,"category":11,"tags":737,"body":739},"What’s new in Git 2.54.0?","Learn about release contributions, including new repository maintenance, a new command to edit commit history, a replacement for git-sizer(1), and more.",[734],"Patrick Steinhardt","https://res.cloudinary.com/about-gitlab-com/image/upload/v1776711651/sj7xxyyuimlarswbyft5.png","2026-04-20",[738,22,265],"open source","The Git project recently released [Git 2.54.0](https://lore.kernel.org/git/xmqqa4uxsjrs.fsf@gitster.g/T/#u). 
Let's look at a few notable highlights from this release, which includes contributions from the Git team at GitLab.\n\n## Pluggable Object Databases\n\nGit already has the ability to store references with either the \"files\" backend or with the [\"reftable\" backend](https://about.gitlab.com/blog/a-beginners-guide-to-the-git-reftable-format/). This is achieved by having proper abstractions in Git that allow us to have different backends.\n\nBut references are just one of the two important types of data stored in repositories, with the other being objects. Objects are stored in the object database, and each object database in turn consists of multiple object sources from which objects can be read and to which they can be written. Each object source either stores individual objects as so-called \"loose\" objects, or compresses multiple objects into a \"packfile\" in your `.git/objects` directory.\n\nUntil now, however, these sources did not have a proper abstraction boundary, so the storage format for objects was completely hardcoded into Git. But this is finally changing with pluggable object databases! The concept is straightforward and similar to how we did this for references in the past: Instead of having hardcoded code paths for how to store objects, we introduce an abstraction boundary that allows us to have different backends for storing objects.\n\nWhile the idea is simple, the implementation is not, as we have hardcoded assumptions about the storage formats used in Git all over the place. In fact, we started working on this topic in Git 2.48, which was released in January 2025. Initially, we focused on making object-related subsystems self-contained and creating proper subsystems for the existing backends that we had in Git.\n\nWith Git 2.54, we have now reached a milestone: The object database backend is now pluggable. 
Not all of Git's functionality is covered yet, but introducing an alternate backend that handles a meaningful subset of operations is now a realistic undertaking.\n\nFor now, only local workflows like creating commits, showing commit graphs, or performing merges will work with such an alternative implementation. This notably excludes anything that interacts with a remote, such as when you want to fetch or push changes. Regardless, this is the culmination of almost two years of work spanning nearly 400 commits that have been merged upstream, and we will of course continue to iterate on this effort.\n\nSo why does this matter? The idea is that it becomes practical to introduce new storage formats into Git. Examples could be:\n- A storage format that is able to store large binary files more efficiently\n  than packfiles do today\n\n- A storage format that is custom-tailored for GitLab to ensure that we can\n  serve repositories to our users even more efficiently than we currently can\n\nThis is a large-scale effort that is likely to shape the future of Git and GitLab.\n\n*This project was led by [Patrick Steinhardt](https://gitlab.com/pks-gitlab).*\n\n## Easier editing of your commit history\n\nIn many software development projects it is common practice for developers to not only polish the code they want to contribute, but also to polish the commit history so that it becomes easy to review. The result is a set of small and atomic commits that each do one thing, with a good commit message that describes the intent of the commit as well as specific nuances.\n\nOf course, more often than not, these atomic commits are not something that just happens naturally during the development process. Instead, the author of the changes will gain a better understanding of what they are while iterating on them, and the way to split up the commits will become clearer over time. 
Furthermore, the subsequent review process may result in feedback that requires changes to the crafted commits.\n\nThe consequence is that the developer has to rewrite their commit history many times during development. Historically, Git has allowed for this use case via [interactive rebases](https://git-scm.com/docs/git-rebase#_interactive_mode). These interactive rebases are an extremely powerful tool: They let you reorder commits, rewrite commit messages, squash multiple commits together, or perform arbitrary edits of any commit.\n\nBut they are also somewhat arcane and hard to understand. The user needs to figure out the base commit for the rebase, they need to understand how to edit a somewhat obscure \"instruction sheet,\" and they need to be aware of how the stateful rebasing process works. For example, users are presented with an instruction sheet similar to the following when rebasing a topic branch:\n\n```shell\npick b60623f382 # t: detect errors outside of test cases # empty\npick b80cb55882 # t: prepare `test_match_signal ()` calls for `set -e`\npick 5ffe397f30 # t: prepare `test_must_fail ()` for `set -e`\npick 5e9b0cf5e1 # t: prepare `stop_git_daemon ()` for `set -e`\npick 299561e7a2 # t: prepare `git config --unset` calls for `set -e`\npick ed0e7ca2b5 # t: detect errors outside of test cases\n```\n\nSo while interactive rebases are powerful, they are also quite intimidating for the average user.\n\nIt doesn't have to be this way, though. Tools like [Jujutsu](https://www.jj-vcs.dev/latest/) provide interfaces that are much easier to use than Git, as you can for example simply execute `jj split` to split up a commit into two commits. With Git and interactive rebases, this use case requires a lot of different steps with confusing command line arguments.\n\nWe have thus taken inspiration from Jujutsu and have introduced a new git-history(1) command into Git that is the foundation for better history editing. 
For now, this command has two subcommands:

- `git history reword` allows you to easily rewrite a commit message. You simply
  give it the commit whose message you want to reword, Git asks you for the new
  commit message, and that's it.
- `git history split` allows you to split a commit into two, inspired by
  `jj split`. You give it a commit, Git asks you which changes to stage into
  which commit and for the two commit messages, and then you're done.

This is of course only a start, and we want to add additional subcommands over time. For example:

- `git history fixup` to take staged changes and automatically amend them to a
  specific commit
- `git history drop` to remove a commit
- `git history reorder` to reorder the sequence of commits
- `git history squash` to squash a range of commits

But that's not all! In addition to making history editing easy, this new command also knows to automatically rebase all of your local branches that previously included the edited commit. That means you can even edit a commit that is not on the current branch, and all branches that contain it will be rewritten.

It may seem puzzling at first that Git automatically rebases dependent branches, as that is a significant departure from how git-rebase(1) works. But this is part of a bigger effort to bring better support for Stacked Diffs to Git, a workflow in which a series of dependent branches can be reviewed independently while together working towards a bigger goal.

*This project was led by [Patrick Steinhardt](https://gitlab.com/pks-gitlab) with support from [Elijah Newren](https://github.com/newren).*

## A native replacement for git-sizer(1)

The size of a Git repository is an important factor that determines how well Git and GitLab can handle it.
But size alone is not the only factor, as the performance of a repository is ultimately a combination of multiple dimensions:

- The depth of the commit history
- The shape of the directory structure
- The size of files stored in the repository
- The number of references

These are only some of the dimensions one needs to consider when trying to predict whether Git will handle a repository well.

But while repository size by itself is clearly an insufficient signal, Git does not provide any tooling that gives the user an easy overview of these metrics. Instead, users are forced to rely on third-party tools like [git-sizer(1)](https://github.com/github/git-sizer) to fill this gap. This tool does an excellent job of surfacing this information, but it is not part of Git itself and thus needs to be installed separately.

Observability of repository internals is critical to us at GitLab, so we introduced a [new `git repo structure` command into Git 2.52](https://about.gitlab.com/blog/whats-new-in-git-2-52-0/#new-subcommand-for-git-repo1-to-display-repository-metrics) to display repository metrics, which we extended in Git 2.53 to [show inflated and disk sizes for objects by type](https://about.gitlab.com/blog/whats-new-in-git-2-53-0/#more-data-collected-in-git-repo-structure).

In Git 2.54, we are iterating further on this command so that it not only shows the overall size, but also the largest objects by type:

```shell
$ git clone https://gitlab.com/git-scm/git.git
$ cd git
$ git repo structure
Counting objects: 410445, done.
| Repository structure      | Value       |
| ------------------------- | ----------- |
| * References              |             |
|   * Count                 |    1.01 k   |
|     * Branches            |       1     |
|     * Tags                |    1.00 k   |
|     * Remotes             |       9     |
|     * Others              |       0     |
|                           |             |
| * Reachable objects       |             |
|   * Count                 |  410.45 k   |
|     * Commits             |   83.99 k   |
|     * Trees               |  164.46 k   |
|     * Blobs               |  161.00 k   |
|     * Tags                |    1.00 k   |
|   * Inflated size         |    7.46 GiB |
|     * Commits             |   57.53 MiB |
|     * Trees               |    2.33 GiB |
|     * Blobs               |    5.07 GiB |
|     * Tags                |  737.48 KiB |
|   * Disk size             |  181.37 MiB |
|     * Commits             |   33.11 MiB |
|     * Trees               |   40.58 MiB |
|     * Blobs               |  107.11 MiB |
|     * Tags                |  582.67 KiB |
|                           |             |
| * Largest objects         |             |
|   * Commits               |             |
|     * Maximum size    [1] |   17.23 KiB |
|     * Maximum parents [2] |      10     |
|   * Trees                 |             |
|     * Maximum size    [3] |   58.85 KiB |
|     * Maximum entries [4] |    1.18 k   |
|   * Blobs                 |             |
|     * Maximum size    [5] | 1019.51 KiB |
|   * Tags                  |             |
|     * Maximum size    [6] |    7.13 KiB |

[1] f6ecb603ff8af608a417d7724727d6bc3a9dbfdf
[2] 16d7601e176cd53f3c2f02367698d06b85e08879
[3] 203ee97047731b9fd3ad220faa607b6677861a0d
[4] 203ee97047731b9fd3ad220faa607b6677861a0d
[5] aa96f8bc361fd84a1459440f1e7de02ab0dc3543
[6] 07e38db6a5a03690034d27104401f6c8ea40f1fc
```

With this information, we're now almost feature-complete compared to git-sizer(1).
We're not done yet, though. We plan to eventually add additional features such as:

- Severity levels as they exist in git-sizer(1)
- Graphs that show you the distribution of object sizes
- The ability to scan objects reachable via a subset of references

*This project was led by [Justin Tobler](https://gitlab.com/justintobler).*

## New infrastructure for repository maintenance

Whenever you write data into a Git repository, you will typically end up adding more loose objects. Left unmanaged, this leads to a large number of separate files in your `.git/objects/` directory, which slows down operations that need to access many objects at once. Git thus regularly packs these objects into "packfiles" to ensure good performance.

This isn't the only data structure that may become inefficient over time: Updating references may create loose references, reflogs need trimming, worktrees may become stale, and caches like commit-graphs need to be refreshed regularly.

All of these tasks have historically been managed by [git-gc(1)](https://git-scm.com/docs/git-gc). However, this tool has a monolithic architecture: it executes all required tasks in a fixed, sequential order. This foundation is hard to extend and doesn't give the end user much flexibility to modify how housekeeping is performed.

The Git project introduced the new [git-maintenance(1)](https://git-scm.com/docs/git-maintenance) tool in Git 2.29. In contrast to git-gc(1), git-maintenance(1) is not monolithic but is structured around tasks. These tasks are freely configurable, so the user can control which tasks run, giving them much more fine-grained control over repository maintenance.

Eventually, Git migrated to using git-maintenance(1) by default. But in the beginning, the only task enabled by default was the git-gc(1) task, which, as you might have guessed, simply executes `git gc`.
To manually run maintenance using this new command you can execute `git maintenance run`, but Git also knows to execute it automatically after several other commands.

Over the last couple of releases we have implemented in git-maintenance(1) all of the individual tasks supported by git-gc(1), to ensure feature parity between the two tools.

Furthermore, we have implemented a new task that uses Git's modern architecture for repacking objects with [geometric compaction](https://git-scm.com/docs/git-repack#Documentation/git-repack.txt---geometricfactor). Geometric compaction is a much better fit for large monorepos, and with our efforts to make it work well with partial clones [that landed in Git 2.53](https://about.gitlab.com/blog/whats-new-in-git-2-53-0/#geometric-repacking-support-with-promisor-remotes), it is now a full replacement for our previous repacking strategy in Git.

In Git 2.54, we have reached another significant milestone: Instead of using the git-gc(1)-based strategy by default, we now use geometric repacking with fine-grained individual maintenance tasks! Besides being more efficient for large monorepos, this also gives us an easier foundation to iterate on going forward.

*The git-maintenance(1) infrastructure was originally implemented by [Derrick Stolee](https://github.com/derrickstolee) and geometric maintenance was introduced by [Taylor Blau](https://github.com/ttaylorr). The effort to introduce the new fine-grained tasks and migrate to the new maintenance strategy was led by [Patrick Steinhardt](https://gitlab.com/pks-gitlab).*

## Read more

This article highlighted just a few of the contributions made by GitLab and the wider Git community for this latest release. You can learn about more of them in the [official release announcement](https://lore.kernel.org/git/xmqqa4uxsjrs.fsf@gitster.g/T/#u) from the Git project.
Also, check out our [previous Git release blog posts](https://about.gitlab.com/blog/tags/git/) to see past highlights of contributions from GitLab team members.

# What’s new in Git 2.53.0?

*By [Justin Tobler](https://gitlab.com/justintobler), 2026-02-02*

The Git project recently released [Git 2.53.0](https://lore.kernel.org/git/xmqq4inz13e3.fsf@gitster.g/T/#u). Let's look at a few notable highlights from this release, which includes contributions from the Git team at GitLab.

## Geometric repacking support with promisor remotes

Newly written objects in a Git repository are often stored as individual loose files. To ensure good performance and optimal use of disk space, these loose objects are regularly compressed into so-called packfiles. The number of packfiles in a repository grows over time as a result of the user's activities, like writing new commits or fetching from a remote. As the number of packfiles increases, Git has to do more work to look up individual objects. Therefore, to preserve optimal repository performance, packfiles are periodically repacked via git-repack(1) to consolidate the objects into fewer packfiles. When repacking, there are two strategies: "all-into-one" and "geometric".

The all-into-one strategy is fairly straightforward and the current default. As its name implies, all objects in the repository are packed into a single packfile. From a performance perspective this is great, as Git only has to scan through a single packfile when looking up objects.
The main downside of this strategy is that computing a single packfile can take a significant amount of time for large repositories.

The geometric strategy mitigates this concern by maintaining a geometric progression of packfiles based on their size instead of always repacking into a single packfile. More plainly: Git maintains a set of packfiles ordered by size, where each packfile in the sequence is expected to be at least twice the size of the preceding one. If a packfile violates this property, packfiles are combined as needed until the progression is restored. This strategy still keeps the number of packfiles in a repository small, while also minimizing the amount of work that must be done for most repacking operations.

One problem with the geometric repacking strategy was that it was not compatible with partial clones. Partial clones allow the user to clone only parts of a repository by, for example, skipping all blobs larger than 1 megabyte. This can significantly reduce the size of a repository, and Git knows how to backfill missing objects that it needs to access at a later point in time.

The result is a repository that is missing some objects, and objects received from the promisor remote, which may reference those missing objects, are stored in a "promisor" packfile. When repacking, this promisor property needs to be retained for packfiles containing promisor objects, so that Git knows whether a missing object is expected and can be backfilled from the promisor remote. With an all-into-one repack, Git knows how to handle promisor objects properly and stores them in a separate promisor packfile. Unfortunately, the geometric repacking strategy did not give special treatment to promisor packfiles and would instead merge them with normal packfiles without considering whether they reference promisor objects.
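To make the size invariant behind geometric repacking concrete, here is a small shell sketch (illustrative only, not Git's actual implementation) that checks whether a list of pack sizes, sorted ascending, keeps the factor-of-two progression that `git repack --geometric=2` maintains:

```shell
# Illustrative sketch: each pack must be at least twice the size of the
# preceding (smaller) pack, otherwise the progression is violated and
# git-repack(1) would combine packs until it is restored.
check_geometric() {
    prev=0
    for size in "$@"; do
        if [ "$prev" -ne 0 ] && [ "$size" -lt $((2 * prev)) ]; then
            echo "violated at pack of size $size"
            return 1
        fi
        prev=$size
    done
    echo "progression holds"
}

check_geometric 1 2 3 9   # prints "violated at pack of size 3" (3 < 2 * 2)
check_geometric 1 2 4 9   # prints "progression holds"
```

In Git itself, a violation triggers combining the offending packfiles as described above; the sketch only surfaces where the invariant breaks.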
Luckily, a bug caused the underlying git-pack-objects(1) to die when geometric repacking was used in a partial clone repository. This means repositories in this configuration could not be repacked at all, which isn't great, but is better than repository corruption.

With the release of Git 2.53, geometric repacking now works with partial clone repositories. When performing a geometric repack, promisor packfiles are handled separately in order to preserve the promisor marker, and are repacked following a separate geometric progression. With this fix, the geometric strategy moves closer to becoming the default repacking strategy. For more information, check out the corresponding [mailing list thread](https://lore.kernel.org/git/20260105-pks-geometric-repack-with-promisors-v1-0-c4660573437e@pks.im/).

This project was led by [Patrick Steinhardt](https://gitlab.com/pks-gitlab).

## git-fast-import(1) learned to preserve only valid signatures

In our [Git 2.52 release article](https://about.gitlab.com/blog/whats-new-in-git-2-52-0/), we covered signature-related improvements to git-fast-import(1) and git-fast-export(1). Be sure to check out that post for a more detailed explanation of these commands, how they are used, and the changes made with regard to signatures.

To quickly recap, git-fast-import(1) provides a backend to efficiently import data into a repository and is used by tools such as [git-filter-repo(1)](https://github.com/newren/git-filter-repo) to rewrite the history of a repository in bulk. In the Git 2.52 release, git-fast-import(1) learned the `--signed-commits=<mode>` option, similar to the same option in git-fast-export(1). With this option, it became possible to unconditionally retain or strip signatures from commits and tags.

In situations where only part of the repository history has been rewritten, any signature for a rewritten commit or tag becomes invalid.
This meant git-fast-import(1) was limited to either stripping all signatures or keeping all signatures, even those that had become invalid. But retaining invalid signatures doesn't make much sense, so rewriting history with git-filter-repo(1) resulted in all signatures being stripped, even when the underlying commit or tag was not rewritten. This is unfortunate: if a commit or tag is unchanged, its signature is still valid, so there is no real reason to strip it. What is really needed is a way to preserve signatures for unchanged objects but strip invalid ones.

With the release of Git 2.53, the git-fast-import(1) `--signed-commits=<mode>` option has learned a new `strip-if-invalid` mode which, when specified, only strips signatures from commits that become invalid due to being rewritten. Thus, with this option it becomes possible to preserve some commit signatures when using git-fast-import(1). This is a critical step towards providing the foundation for tools like git-filter-repo(1) to preserve valid signatures and eventually re-sign invalidated ones.

This project was led by [Christian Couder](https://gitlab.com/chriscool).

## More data collected in git-repo-structure

In the Git 2.52 release, the "structure" subcommand was introduced to git-repo(1). The intent of this command is to collect information about the repository and eventually become a native replacement for tools such as [git-sizer(1)](https://github.com/github/git-sizer). At GitLab, we host some extremely large repositories, and having insight into the general structure of a repository is critical to understanding its performance characteristics. In this release, the command now also collects total size information for reachable objects to help understand the overall size of the repository.
In the output below, you can see the command now collects both the total inflated and disk sizes of reachable objects by object type.

```shell
$ git repo structure

| Repository structure | Value      |
| -------------------- | ---------- |
| * References         |            |
|   * Count            |   1.78 k   |
|     * Branches       |      5     |
|     * Tags           |   1.03 k   |
|     * Remotes        |    749     |
|     * Others         |      0     |
|                      |            |
| * Reachable objects  |            |
|   * Count            | 421.37 k   |
|     * Commits        |  88.03 k   |
|     * Trees          | 169.95 k   |
|     * Blobs          | 162.40 k   |
|     * Tags           |    994     |
|   * Inflated size    |   7.61 GiB |
|     * Commits        |  60.95 MiB |
|     * Trees          |   2.44 GiB |
|     * Blobs          |   5.11 GiB |
|     * Tags           | 731.73 KiB |
|   * Disk size        | 301.50 MiB |
|     * Commits        |  33.57 MiB |
|     * Trees          |  77.92 MiB |
|     * Blobs          | 189.44 MiB |
|     * Tags           | 578.13 KiB |
```

The keen-eyed among you may have noticed that the size values in the table are now also listed in a more human-friendly manner, with units appended. In subsequent releases we hope to further expand this command's output with additional data points, such as the largest individual objects in the repository.

This project was led by [Justin Tobler](https://gitlab.com/justintobler).

## Read more

This article highlighted just a few of the contributions made by GitLab and the wider Git community for this latest release. You can learn about more of them in the [official release announcement](https://lore.kernel.org/git/xmqq4inz13e3.fsf@gitster.g/T/#u) from the Git project.
Also, check out our [previous Git release blog posts](https://about.gitlab.com/blog/tags/git/) to see past highlights of contributions from GitLab team members.