[{"data":1,"prerenderedAt":814},["ShallowReactive",2],{"/en-us/blog/a-beginners-guide-to-the-git-reftable-format":3,"navigation-en-us":38,"banner-en-us":448,"footer-en-us":458,"blog-post-authors-en-us-Patrick Steinhardt":700,"blog-related-posts-en-us-a-beginners-guide-to-the-git-reftable-format":714,"blog-promotions-en-us":750,"next-steps-en-us":804},{"id":4,"title":5,"authorSlugs":6,"body":8,"categorySlug":9,"config":10,"content":14,"description":8,"extension":27,"isFeatured":12,"meta":28,"navigation":12,"path":29,"publishedDate":20,"seo":30,"stem":35,"tagSlugs":36,"__hash__":37},"blogPosts/en-us/blog/a-beginners-guide-to-the-git-reftable-format.yml","A Beginners Guide To The Git Reftable Format",[7],"patrick-steinhardt",null,"open-source",{"slug":11,"featured":12,"template":13},"a-beginners-guide-to-the-git-reftable-format",true,"BlogPost",{"title":15,"description":16,"authors":17,"heroImage":19,"date":20,"body":21,"category":9,"tags":22},"A beginner's guide to the Git reftable format","In Git 2.45.0, GitLab upstreamed the reftable backend to Git, which completely changes how references are stored. Get an in-depth look at the inner workings of this new format.",[18],"Patrick Steinhardt","https://res.cloudinary.com/about-gitlab-com/image/upload/v1749664595/Blog/Hero%20Images/blog-image-template-1800x945__9_.png","2024-05-30","Until recently, the \"files\" format was the only way for Git to store references. With the [release of Git 2.45.0](https://about.gitlab.com/blog/whats-new-in-git-2-45-0/), Git can now store references in a \"reftable\" format. This new format is a binary format that is quite a bit more complex, but that complexity allows it to address several shortcomings of the \"files\" format. 
The design goals for the \"reftable\" format include:\n\n- Make the lookup of a single reference and iteration through ranges of references as efficient and fast as possible.\n- Support for consistent reads of references so that Git never reads an in-between state when an update to multiple references has been applied only partially.\n- Support for atomic writes such that updating multiple references can be implemented as an all-or-nothing operation.\n- Efficient storage of both refs and the reflog.\n\nIn this article, we will go under the hood of the \"reftable\" format to see exactly how it works.\n\n## How Git stores references\n\nBefore we dive into the details of the \"reftable\" format, let's quickly recap how Git has historically stored references. If you are already familiar with this, you can skip this section.\n\nA Git repository keeps track of two important data structures:\n\n- [Objects](https://git-scm.com/book/en/v2/Git-Internals-Git-Objects), which contain the actual data of your repository. This includes commits, the directory tree structure, and the blobs that contain your source code. Objects point to each other, forming an object graph. Furthermore, each object has an object ID that uniquely identifies the object.\n\n- References, such as branches and tags, which are pointers into the object graph so that you can give objects names that are easier to remember and keep track of different tracks of your development history. For example, a repository may contain a `main` branch, which is a reference named `refs/heads/main` that points to a specific commit.\n\nReferences are stored in the reference database. Until Git 2.45.0, there was only the \"files\" database format. 
In this format, every reference is stored as a normal file that contains either one of the following:\n\n- A regular reference that contains the object ID of the commit it points to.\n- A symbolic reference that contains the name of another reference, similar to how a symbolic link points to another file.\n\nAt regular intervals, these references get packed into a single `packed-refs` file to make lookups more efficient.\n\nThe following examples should give an idea of how the \"files\" format operates:\n\n```shell\n$ git init .\n$ git commit --allow-empty --message \"Initial commit\"\n[main (root-commit) 6917c17] Initial commit\n\n# HEAD is a symbolic reference pointing to refs/heads/main.\n$ cat .git/HEAD\nref: refs/heads/main\n\n# refs/heads/main is a regular reference pointing to a commit.\n$ cat .git/refs/heads/main\n6917c178cfc3c50215a82cf959204e9934af24c8\n\n# git-pack-refs(1) packs these references into the packed-refs file.\n$ git pack-refs --all\n$ cat .git/packed-refs\n# pack-refs with: peeled fully-peeled sorted\n6917c178cfc3c50215a82cf959204e9934af24c8 refs/heads/main\n```\n\n## High-level structure of reftables\n\nAssuming that you've got Git 2.45.0 or newer installed, you can create a repository with the \"reftable\" format by using the `--ref-format=reftable` switch:\n\n```shell\n$ git init --ref-format=reftable .\nInitialized empty Git repository in /tmp/repo/.git/\n$ git rev-parse --show-ref-format\nreftable\n\n# Irrelevant files have been removed for ease of understanding.\n$ tree .git\n.git\n├── config\n├── HEAD\n├── index\n├── objects\n├── refs\n│   └── heads\n└── reftable\n\t├── 0x000000000001-0x000000000002-40a482a9.ref\n\t└── tables.list\n\n4 directories, 6 files\n```\n\nFirst, looking at the repository configuration, you will see it has an `extensions.refStorage` key:\n\n```shell\n$ cat .git/config\n[core]\n    repositoryformatversion = 1\n    filemode = true\n    bare = false\n    logallrefupdates = true\n[extensions]\n    refstorage = 
reftable\n```\n\nThis configuration indicates to Git that the repository has been initialized with the \"reftable\" format and tells Git to use the \"reftable\" backend to access it.\n\nWeirdly enough, the repository still has a few files that look as if the \"files\" backend was in use:\n\n- `HEAD` would usually be a symbolic reference pointing to your currently checked-out branch. While it is not used by the \"reftable\" backend, it is required for Git clients to detect the directory as a Git repository. Therefore, when using the \"reftable\" format, `HEAD` is a stub with contents `ref: refs/heads/.invalid`.\n\n- `refs/heads` is a file with contents `this repository uses the reftable format`. Git clients that do not know about the \"reftable\" format would usually expect this path to be a directory. Consequently, creating this path as a file intentionally causes such older Git clients to fail if they tried to access the repository with the \"files\" backend.\n\nThe actual references are stored in the `reftable/` directory:\n\n```shell\n$ tree .git/reftable\n.git/reftable/\n├── 0x000000000001-0x000000000001-794bd722.ref\n└── tables.list\n\n$ cat .git/reftable/tables.list\n0x000000000001-0x000000000001-794bd722.ref\n```\n\nThere are two files here:\n\n- `0x000000000001-0x000000000001-794bd722.ref` is a table containing references and the reflog data in a binary format.\n\n- `tables.list` is, well, a list of tables. In the current state of the repository, the file contains a single line, which is the name of the table. 
This file tracks the current set of active tables in the \"reftable\" database and is updated whenever new tables get added to the repository.\n\nUpdating a reference creates a new table:\n\n```shell\n$ git commit --allow-empty --message \"Initial commit\"\n[main (root-commit) 1472a58] Initial commit\n\n$ tree .git/reftable\n.git/reftable/\n├── 0x000000000001-0x000000000002-eb87d12b.ref\n└── tables.list\n\n$ cat .git/reftable/tables.list\n0x000000000001-0x000000000002-eb87d12b.ref\n```\n\nAs you can see, the previous table has been replaced with a new one. Furthermore, the `tables.list` file has been updated to contain the new table.\n\n## The structure of a table\n\nAs mentioned earlier, the actual data of the reference database is contained in tables. Roughly speaking, a table is split up into multiple sections:\n\n- The \"header\" contains metadata about the table. Along with some other information, this includes the version of the format, the block size, and the hash function used by the repository (for example, SHA1 or SHA256).\n- The \"ref\" section contains your references. These records have a key that equals the reference name and point to either an object ID for regular references, or to another reference for symbolic references.\n- The \"obj\" section contains reverse mapping from object IDs to the references that point to those object IDs. These allow Git to efficiently look up which references point to a given object ID.\n- The \"log\" section contains your reflog entries. These records have a key that equals the reference name plus an index that represents the number of the log entry. 
Furthermore, they contain the old and new object IDs as well as the message for that reflog entry.\n- The \"footer\" contains offsets to the various sections.\n\n![long table with all the reftable sections](https://res.cloudinary.com/about-gitlab-com/image/upload/v1749675179/Blog/Content%20Images/Frame_1_-_Reftable_overview.svg)\n\nEach of the section types is structured in a similar manner. Sections contain a set of records that are sorted by each record's key. For example, when you have two references `refs/heads/aaaaa` and `refs/heads/bbb`, you have two ref records with these reference names as their respective keys, and `refs/heads/aaaaa` would come before `refs/heads/bbb`.\n\nFurthermore, each section is divided into blocks of a fixed length. This block length is encoded in the header and serves two purposes:\n\n- Given the start of the section as well as the block size, the reader implicitly knows where each of the blocks starts. This allows Git to easily seek into the middle of a section without reading preceding blocks, which enables binary searches over blocks to speed up the lookup of records.\n- It ensures that the reader knows how much data to read from the disk at a time. Consequently, the block size is by default set to 4KiB, which is the most common sector size for hard disks. The maximum block size is 16MB.\n\nWhen we peek into, for example, a \"ref\" section, it looks roughly like the following graphic. Note how its records are ordered lexicographically inside the blocks, but also across the blocks.\n\n![reference block uncompressed](https://res.cloudinary.com/about-gitlab-com/image/upload/v1749675179/Blog/Content%20Images/Frame_2_-_Ref_block_uncompressed.svg)\n\nEquipped with the current information, we can locate a record by using the following steps:\n\n1. Perform a binary search over the blocks by looking at the keys of their respective first records, identifying the block that must contain our record.\n\n2. 
Perform a linear search over the records in that block.\n\nBoth of these steps are still somewhat inefficient. If we have many blocks we may have to read logarithmically many of them in our binary search to find the desired one. And when blocks contain many records, we potentially have to read all of them during the linear search.\n\nThe \"reftable\" format has additional built-in mechanisms to address these performance concerns. We will touch on these over the next few sections.\n\n### Prefix compression\n\nAs you may have noticed, all of the record keys share the same prefix `refs/`. This is a common thing in Git:\n\n- All branches start with `refs/heads/`.\n- All tags start with `refs/tags/`.\n\nTherefore, we expect that subsequent records will most likely share a significant prefix of their key. This is a good opportunity to save some precious disk space. Because we know that most keys will share a common prefix, it makes sense to optimize for this.\n\nThe optimization uses prefix compression. Every record encodes a prefix length that tells the reader how many bytes to reuse from the key of the preceding record. If we have two records, `refs/heads/a` and `refs/heads/b`, the latter can be encoded by specifying a prefix length of 11 and then only storing the suffix `b`. The reader will then take the first 11 bytes of `refs/heads/a`, which is `refs/heads/`, and append the suffix `b` to it.\n\n![prefix compression](https://res.cloudinary.com/about-gitlab-com/image/upload/v1749675179/Blog/Content%20Images/Frame_3_-_Ref_block_prefix_compression.svg)\n\n### Restart points\n\nAs explained earlier, the best way to search for a reference in a block with our current understanding of the \"reftable\" format is to do a linear search. This is because records do not have a fixed length, so it is impossible for us to tell where records would start without scanning through the block from the beginning. 
Also, even if records were of fixed length, we would not be able to seek into the middle of a block because the prefix compression also requires us to read preceding records.\n\nDoing a linear search would be quite inefficient because blocks may contain hundreds or even thousands of records. To address this issue, the \"reftable\" format encodes so-called restart points into every block. Restart points are uncompressed records where the prefix compression is reset. Consequently, records at restart points always contain their full key and it becomes possible to directly seek to and read the record without having to read preceding records. These restart points are listed in the footer of each block.\n\nEquipped with this information, we can avoid performing a linear search over the block. Instead, we can now do a binary search over the restart points where we search for the first restart point with a key larger than the sought-after key. From there, it follows that the desired record must be located in the section spanning from the _preceding_ restart point to the identified one.\n\nThus, our initial procedure to look up a record (binary search for the block, linear search for the record) is now:\n\n1. Perform a binary search over the blocks, identifying the block that must contain our record.\n\n2. Perform a binary search over the restart points, identifying the sub-section of the block that must contain our record.\n\n3. Perform a linear search over the records in that sub-section.\n\n![Linear search for a record](https://res.cloudinary.com/about-gitlab-com/image/upload/v1749675179/Blog/Content%20Images/Frame_4_-_Restart_points.svg)\n\n### Indices\n\nWhile the search for records inside a block is now reasonably efficient, it's still inefficient to locate the block itself. A binary search may be reasonably performant when you have a couple of blocks, but repositories with millions of references may have hundreds or even thousands of blocks. 
Without any additional data structure, this would cause logarithmically many disk seeks on average.\n\nTo avoid this, every section may be followed by an index section that provides an efficient way to look up a block. Each index record holds the following information:\n\n- The location of the block that it is indexing.\n- The key of the last record of the block that it is indexing.\n\nWith three or fewer blocks, a binary search will always require, at most, two disk reads to find the desired target block. This is the same number of reads we would have to do with an index: one to read the index itself and one to read the desired block. Consequently, indices are only written when they would actually save some reads, which is the case with four or more indexed blocks.\n\nNow the question is: What happens when the index itself becomes so large that it spans over multiple blocks? You might have guessed it: We write another index that indexes the index. These multi-level indices really only become necessary once you have repositories with hundreds of thousands of references.\n\nEquipped with these indices, we can now make the procedure to look up records even more efficient:\n\n1. Determine whether there is an index by looking at the footer of the table.\n\t- If there is one, perform a binary search over the index to find the desired block. This block may point into an index block itself, in which case we need to repeat this step until we hit a record of the desired type.\n\t- Otherwise, perform a binary search over the blocks as we did before.\n2. Perform a binary search over the restart points, identifying the sub-section of the block that must contain our record.\n3. Perform a linear search over the records in that sub-section.\n\n## Multiple tables\n\nUp to this point, we have only discussed how to read a _single_ table. 
But as the name `tables.list` indicates, you can actually have a list of tables in your \"reftable\" database.\n\nEvery time you update a reference in your repository, a new table is written and appended to `tables.list`. Thus, you will eventually end up with multiple tables:\n\n```shell\n$ tree .git/reftable/\n.git/reftable/\n├── 0x000000000001-0x000000000007-8dcd8a77.ref\n├── 0x000000000008-0x000000000008-30e0f6f6.ref\n└── tables.list\n\n$ cat .git/reftable/tables.list\n0x000000000001-0x000000000007-8dcd8a77.ref\n0x000000000008-0x000000000008-30e0f6f6.ref\n```\n\nReading the actual state of a repository requires us to merge these multiple tables into a single virtual table.\n\nYou might be wondering: If a table is written for each reference update and the same reference is updated multiple times, how does the \"reftable\" format know the most up-to-date value of a given reference? Intuitively, one could assume the value would be the one from the newest table containing the reference.\n\nIn fact, every single record has a so-called update index that encodes the \"priority\" of a record. For example, if two ref records with the same name exist, then the one with the higher update index overrides the one with the lower update index.\n\nThese update indices are visible in the file structure above. The long hex strings (for example `0x000000000001`) are the update indices, where the left-hand side of the table name is the minimum update index contained in the table and the right-hand side is the maximum update index.\n\nMerging the tables then happens via a [priority queue](https://en.wikipedia.org/wiki/Priority_queue) that is ordered by the key of the ref record as well as its update index. Assuming we want to scan through all ref records, we would:\n\n1. 
For every table, add its first record to the priority queue.\n\n![Adding first record to the priority queue](https://res.cloudinary.com/about-gitlab-com/image/upload/v1749675179/Blog/Content%20Images/Frame_5_-_Priority_queue_1.svg)\n\n2. Yield the head of the priority queue. Because the queue is ordered by update index, it must be the most up-to-date version. Add the next item from that table to the priority queue.\n\n![Yielding the head of the priority queue](https://res.cloudinary.com/about-gitlab-com/image/upload/v1749675179/Blog/Content%20Images/Frame_6_-_Priority_queue_2.svg)\n\n3. Drop all records from the queue that have the same name. These records are shadowed, which means that they will not be shown. For each table for which we are dropping records, add the next record to the priority queue.\n\n![Dropping all records from queue that have the same name](https://res.cloudinary.com/about-gitlab-com/image/upload/v1749675179/Blog/Content%20Images/Frame_7_-_Priority_queue_3.svg)\n\nNow we can rinse and repeat to read records for other keys.\n\nTables may contain special \"tombstone\" records that mark a record as having been deleted. This allows us to delete records without having to rewrite all tables to not contain the record anymore.\n\n### Auto-compaction\n\nWhile the idea behind the priority queue is simple enough, it would be rather inefficient to merge together hundreds or even only dozens of tables in this way. So while it is true that every update to your references appends a new table to your `tables.list` file, it is only part of the story.\n\nThe other part is auto-compaction: After a new table has been appended to the list of tables, the \"reftable\" backend checks whether some of the tables should be merged. This is done by using a simple heuristic: We check whether the list of tables forms a [geometric sequence](https://en.wikipedia.org/wiki/Geometric_progression) with the file sizes. 
Every table `n` must be at least twice as large as the next-most-recent table `n + 1`. If that geometric sequence is violated, the backend will compact tables so that the geometric sequence is restored.\n\nOver time, this will lead to structures that look like the following:\n\n```shell\n$ du --apparent-size .git/reftable/*\n429    .git/reftable/0x000000000001-0x00000000bd7c-d9819000.ref\n101    .git/reftable/0x00000000bd7d-0x00000000c5ac-c34b88a4.ref\n32    .git/reftable/0x00000000c5ad-0x00000000cc6c-60391f53.ref\n8    .git/reftable/0x00000000cc6d-0x00000000cdc1-61c30db1.ref\n3    .git/reftable/0x00000000cdc2-0x00000000ce67-d9b55a96.ref\n1    .git/reftable/0x00000000ce68-0x00000000ce6b-44721696.ref\n1    .git/reftable/tables.list\n```\n\nNote how for every single table, the property `size(n) > size(n+1) * 2` holds.\n\nOne of the consequences of auto-compaction is that the \"reftable\" backend maintains itself. We no longer have to run `git pack-refs` in a repository.\n\n## Want to learn more?\n\nYou should now have a good understanding of how the new \"reftable\" format works under the hood. 
If you want to dive even deeper into the format, you can refer to the [technical documentation](https://git-scm.com/docs/reftable) provided by the Git project.\n\n> Read our [Git 2.45.0 recap](https://about.gitlab.com/blog/whats-new-in-git-2-45-0/) to find out what else is in this version of Git.",[23,24,25,26],"git","tutorial","open source","performance","yml",{},"/en-us/blog/a-beginners-guide-to-the-git-reftable-format",{"title":15,"description":16,"ogTitle":15,"ogDescription":16,"noIndex":31,"ogImage":19,"ogUrl":32,"ogSiteName":33,"ogType":34,"canonicalUrls":32},false,"https://about.gitlab.com/blog/a-beginners-guide-to-the-git-reftable-format","https://about.gitlab.com","article","en-us/blog/a-beginners-guide-to-the-git-reftable-format",[23,24,9,26],"6rh0E9_hVOKO2NuNe8_IMVhDl5iBcSDhbVXCWDatuAE",{"data":39},{"logo":40,"freeTrial":45,"sales":50,"login":55,"items":60,"search":368,"minimal":399,"duo":418,"switchNav":427,"pricingDeployment":438},{"config":41},{"href":42,"dataGaName":43,"dataGaLocation":44},"/","gitlab logo","header",{"text":46,"config":47},"Get free trial",{"href":48,"dataGaName":49,"dataGaLocation":44},"https://gitlab.com/-/trial_registrations/new?glm_source=about.gitlab.com&glm_content=default-saas-trial/","free trial",{"text":51,"config":52},"Talk to sales",{"href":53,"dataGaName":54,"dataGaLocation":44},"/sales/","sales",{"text":56,"config":57},"Sign in",{"href":58,"dataGaName":59,"dataGaLocation":44},"https://gitlab.com/users/sign_in/","sign in",[61,88,183,188,289,349],{"text":62,"config":63,"cards":65},"Platform",{"dataNavLevelOne":64},"platform",[66,72,80],{"title":62,"description":67,"link":68},"The intelligent orchestration platform for DevSecOps",{"text":69,"config":70},"Explore our Platform",{"href":71,"dataGaName":64,"dataGaLocation":44},"/platform/",{"title":73,"description":74,"link":75},"GitLab Duo Agent Platform","Agentic AI for the entire software lifecycle",{"text":76,"config":77},"Meet GitLab 
Duo",{"href":78,"dataGaName":79,"dataGaLocation":44},"/gitlab-duo-agent-platform/","gitlab duo agent platform",{"title":81,"description":82,"link":83},"Why GitLab","See the top reasons enterprises choose GitLab",{"text":84,"config":85},"Learn more",{"href":86,"dataGaName":87,"dataGaLocation":44},"/why-gitlab/","why gitlab",{"text":89,"left":12,"config":90,"link":92,"lists":96,"footer":165},"Product",{"dataNavLevelOne":91},"solutions",{"text":93,"config":94},"View all Solutions",{"href":95,"dataGaName":91,"dataGaLocation":44},"/solutions/",[97,121,144],{"title":98,"description":99,"link":100,"items":105},"Automation","CI/CD and automation to accelerate deployment",{"config":101},{"icon":102,"href":103,"dataGaName":104,"dataGaLocation":44},"AutomatedCodeAlt","/solutions/delivery-automation/","automated software delivery",[106,110,113,117],{"text":107,"config":108},"CI/CD",{"href":109,"dataGaLocation":44,"dataGaName":107},"/solutions/continuous-integration/",{"text":73,"config":111},{"href":78,"dataGaLocation":44,"dataGaName":112},"gitlab duo agent platform - product menu",{"text":114,"config":115},"Source Code Management",{"href":116,"dataGaLocation":44,"dataGaName":114},"/solutions/source-code-management/",{"text":118,"config":119},"Automated Software Delivery",{"href":103,"dataGaLocation":44,"dataGaName":120},"Automated software delivery",{"title":122,"description":123,"link":124,"items":129},"Security","Deliver code faster without compromising security",{"config":125},{"href":126,"dataGaName":127,"dataGaLocation":44,"icon":128},"/solutions/application-security-testing/","security and compliance","ShieldCheckLight",[130,134,139],{"text":131,"config":132},"Application Security Testing",{"href":126,"dataGaName":133,"dataGaLocation":44},"Application security testing",{"text":135,"config":136},"Software Supply Chain Security",{"href":137,"dataGaLocation":44,"dataGaName":138},"/solutions/supply-chain/","Software supply chain security",{"text":140,"config":141},"Software 
Compliance",{"href":142,"dataGaName":143,"dataGaLocation":44},"/solutions/software-compliance/","software compliance",{"title":145,"link":146,"items":151},"Measurement",{"config":147},{"icon":148,"href":149,"dataGaName":150,"dataGaLocation":44},"DigitalTransformation","/solutions/visibility-measurement/","visibility and measurement",[152,156,160],{"text":153,"config":154},"Visibility & Measurement",{"href":149,"dataGaLocation":44,"dataGaName":155},"Visibility and Measurement",{"text":157,"config":158},"Value Stream Management",{"href":159,"dataGaLocation":44,"dataGaName":157},"/solutions/value-stream-management/",{"text":161,"config":162},"Analytics & Insights",{"href":163,"dataGaLocation":44,"dataGaName":164},"/solutions/analytics-and-insights/","Analytics and insights",{"title":166,"items":167},"GitLab for",[168,173,178],{"text":169,"config":170},"Enterprise",{"href":171,"dataGaLocation":44,"dataGaName":172},"/enterprise/","enterprise",{"text":174,"config":175},"Small Business",{"href":176,"dataGaLocation":44,"dataGaName":177},"/small-business/","small business",{"text":179,"config":180},"Public Sector",{"href":181,"dataGaLocation":44,"dataGaName":182},"/solutions/public-sector/","public sector",{"text":184,"config":185},"Pricing",{"href":186,"dataGaName":187,"dataGaLocation":44,"dataNavLevelOne":187},"/pricing/","pricing",{"text":189,"config":190,"link":192,"lists":196,"feature":276},"Resources",{"dataNavLevelOne":191},"resources",{"text":193,"config":194},"View all resources",{"href":195,"dataGaName":191,"dataGaLocation":44},"/resources/",[197,230,248],{"title":198,"items":199},"Getting started",[200,205,210,215,220,225],{"text":201,"config":202},"Install",{"href":203,"dataGaName":204,"dataGaLocation":44},"/install/","install",{"text":206,"config":207},"Quick start guides",{"href":208,"dataGaName":209,"dataGaLocation":44},"/get-started/","quick setup 
checklists",{"text":211,"config":212},"Learn",{"href":213,"dataGaLocation":44,"dataGaName":214},"https://university.gitlab.com/","learn",{"text":216,"config":217},"Product documentation",{"href":218,"dataGaName":219,"dataGaLocation":44},"https://docs.gitlab.com/","product documentation",{"text":221,"config":222},"Best practice videos",{"href":223,"dataGaName":224,"dataGaLocation":44},"/getting-started-videos/","best practice videos",{"text":226,"config":227},"Integrations",{"href":228,"dataGaName":229,"dataGaLocation":44},"/integrations/","integrations",{"title":231,"items":232},"Discover",[233,238,243],{"text":234,"config":235},"Customer success stories",{"href":236,"dataGaName":237,"dataGaLocation":44},"/customers/","customer success stories",{"text":239,"config":240},"Blog",{"href":241,"dataGaName":242,"dataGaLocation":44},"/blog/","blog",{"text":244,"config":245},"Remote",{"href":246,"dataGaName":247,"dataGaLocation":44},"https://handbook.gitlab.com/handbook/company/culture/all-remote/","remote",{"title":249,"items":250},"Connect",[251,256,261,266,271],{"text":252,"config":253},"GitLab Services",{"href":254,"dataGaName":255,"dataGaLocation":44},"/services/","services",{"text":257,"config":258},"Community",{"href":259,"dataGaName":260,"dataGaLocation":44},"/community/","community",{"text":262,"config":263},"Forum",{"href":264,"dataGaName":265,"dataGaLocation":44},"https://forum.gitlab.com/","forum",{"text":267,"config":268},"Events",{"href":269,"dataGaName":270,"dataGaLocation":44},"/events/","events",{"text":272,"config":273},"Partners",{"href":274,"dataGaName":275,"dataGaLocation":44},"/partners/","partners",{"backgroundColor":277,"textColor":278,"text":279,"image":280,"link":284},"#2f2a6b","#fff","Insights for the future of software development",{"altText":281,"config":282},"the source promo card",{"src":283},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1758208064/dzl0dbift9xdizyelkk4.svg",{"text":285,"config":286},"Read the 
latest",{"href":287,"dataGaName":288,"dataGaLocation":44},"/the-source/","the source",{"text":290,"config":291,"lists":293},"Company",{"dataNavLevelOne":292},"company",[294],{"items":295},[296,301,307,309,314,319,324,329,334,339,344],{"text":297,"config":298},"About",{"href":299,"dataGaName":300,"dataGaLocation":44},"/company/","about",{"text":302,"config":303,"footerGa":306},"Jobs",{"href":304,"dataGaName":305,"dataGaLocation":44},"/jobs/","jobs",{"dataGaName":305},{"text":267,"config":308},{"href":269,"dataGaName":270,"dataGaLocation":44},{"text":310,"config":311},"Leadership",{"href":312,"dataGaName":313,"dataGaLocation":44},"/company/team/e-group/","leadership",{"text":315,"config":316},"Team",{"href":317,"dataGaName":318,"dataGaLocation":44},"/company/team/","team",{"text":320,"config":321},"Handbook",{"href":322,"dataGaName":323,"dataGaLocation":44},"https://handbook.gitlab.com/","handbook",{"text":325,"config":326},"Investor relations",{"href":327,"dataGaName":328,"dataGaLocation":44},"https://ir.gitlab.com/","investor relations",{"text":330,"config":331},"Trust Center",{"href":332,"dataGaName":333,"dataGaLocation":44},"/security/","trust center",{"text":335,"config":336},"AI Transparency Center",{"href":337,"dataGaName":338,"dataGaLocation":44},"/ai-transparency-center/","ai transparency center",{"text":340,"config":341},"Newsletter",{"href":342,"dataGaName":343,"dataGaLocation":44},"/company/contact/#contact-forms","newsletter",{"text":345,"config":346},"Press",{"href":347,"dataGaName":348,"dataGaLocation":44},"/press/","press",{"text":350,"config":351,"lists":352},"Contact us",{"dataNavLevelOne":292},[353],{"items":354},[355,358,363],{"text":51,"config":356},{"href":53,"dataGaName":357,"dataGaLocation":44},"talk to sales",{"text":359,"config":360},"Support portal",{"href":361,"dataGaName":362,"dataGaLocation":44},"https://support.gitlab.com","support portal",{"text":364,"config":365},"Customer 
portal",{"href":366,"dataGaName":367,"dataGaLocation":44},"https://customers.gitlab.com/customers/sign_in/","customer portal",{"close":369,"login":370,"suggestions":377},"Close",{"text":371,"link":372},"To search repositories and projects, login to",{"text":373,"config":374},"gitlab.com",{"href":58,"dataGaName":375,"dataGaLocation":376},"search login","search",{"text":378,"default":379},"Suggestions",[380,382,386,388,392,396],{"text":73,"config":381},{"href":78,"dataGaName":73,"dataGaLocation":376},{"text":383,"config":384},"Code Suggestions (AI)",{"href":385,"dataGaName":383,"dataGaLocation":376},"/solutions/code-suggestions/",{"text":107,"config":387},{"href":109,"dataGaName":107,"dataGaLocation":376},{"text":389,"config":390},"GitLab on AWS",{"href":391,"dataGaName":389,"dataGaLocation":376},"/partners/technology-partners/aws/",{"text":393,"config":394},"GitLab on Google Cloud",{"href":395,"dataGaName":393,"dataGaLocation":376},"/partners/technology-partners/google-cloud-platform/",{"text":397,"config":398},"Why GitLab?",{"href":86,"dataGaName":397,"dataGaLocation":376},{"freeTrial":400,"mobileIcon":405,"desktopIcon":410,"secondaryButton":413},{"text":401,"config":402},"Start free trial",{"href":403,"dataGaName":49,"dataGaLocation":404},"https://gitlab.com/-/trials/new/","nav",{"altText":406,"config":407},"Gitlab Icon",{"src":408,"dataGaName":409,"dataGaLocation":404},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1758203874/jypbw1jx72aexsoohd7x.svg","gitlab icon",{"altText":406,"config":411},{"src":412,"dataGaName":409,"dataGaLocation":404},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1758203875/gs4c8p8opsgvflgkswz9.svg",{"text":414,"config":415},"Get Started",{"href":416,"dataGaName":417,"dataGaLocation":404},"https://gitlab.com/-/trial_registrations/new?glm_source=about.gitlab.com/get-started/","get started",{"freeTrial":419,"mobileIcon":423,"desktopIcon":425},{"text":420,"config":421},"Learn more about GitLab 
Duo",{"href":78,"dataGaName":422,"dataGaLocation":404},"gitlab duo",{"altText":406,"config":424},{"src":408,"dataGaName":409,"dataGaLocation":404},{"altText":406,"config":426},{"src":412,"dataGaName":409,"dataGaLocation":404},{"button":428,"mobileIcon":433,"desktopIcon":435},{"text":429,"config":430},"/switch",{"href":431,"dataGaName":432,"dataGaLocation":404},"#contact","switch",{"altText":406,"config":434},{"src":408,"dataGaName":409,"dataGaLocation":404},{"altText":406,"config":436},{"src":437,"dataGaName":409,"dataGaLocation":404},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1773335277/ohhpiuoxoldryzrnhfrh.png",{"freeTrial":439,"mobileIcon":444,"desktopIcon":446},{"text":440,"config":441},"Back to pricing",{"href":186,"dataGaName":442,"dataGaLocation":404,"icon":443},"back to pricing","GoBack",{"altText":406,"config":445},{"src":408,"dataGaName":409,"dataGaLocation":404},{"altText":406,"config":447},{"src":412,"dataGaName":409,"dataGaLocation":404},{"title":449,"button":450,"config":455},"See how agentic AI transforms software delivery",{"text":451,"config":452},"Watch GitLab Transcend now",{"href":453,"dataGaName":454,"dataGaLocation":44},"/events/transcend/virtual/","transcend event",{"layout":456,"icon":457,"disabled":12},"release","AiStar",{"data":459},{"text":460,"source":461,"edit":467,"contribute":472,"config":477,"items":482,"minimal":689},"Git is a trademark of Software Freedom Conservancy and our use of 'GitLab' is under license",{"text":462,"config":463},"View page source",{"href":464,"dataGaName":465,"dataGaLocation":466},"https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/","page source","footer",{"text":468,"config":469},"Edit this page",{"href":470,"dataGaName":471,"dataGaLocation":466},"https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/blob/main/content/","web ide",{"text":473,"config":474},"Please 
contribute",{"href":475,"dataGaName":476,"dataGaLocation":466},"https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/blob/main/CONTRIBUTING.md/","please contribute",{"twitter":478,"facebook":479,"youtube":480,"linkedin":481},"https://twitter.com/gitlab","https://www.facebook.com/gitlab","https://www.youtube.com/channel/UCnMGQ8QHMAnVIsI3xJrihhg","https://www.linkedin.com/company/gitlab-com",[483,530,584,628,655],{"title":184,"links":484,"subMenu":499},[485,489,494],{"text":486,"config":487},"View plans",{"href":186,"dataGaName":488,"dataGaLocation":466},"view plans",{"text":490,"config":491},"Why Premium?",{"href":492,"dataGaName":493,"dataGaLocation":466},"/pricing/premium/","why premium",{"text":495,"config":496},"Why Ultimate?",{"href":497,"dataGaName":498,"dataGaLocation":466},"/pricing/ultimate/","why ultimate",[500],{"title":501,"links":502},"Contact Us",[503,506,508,510,515,520,525],{"text":504,"config":505},"Contact sales",{"href":53,"dataGaName":54,"dataGaLocation":466},{"text":359,"config":507},{"href":361,"dataGaName":362,"dataGaLocation":466},{"text":364,"config":509},{"href":366,"dataGaName":367,"dataGaLocation":466},{"text":511,"config":512},"Status",{"href":513,"dataGaName":514,"dataGaLocation":466},"https://status.gitlab.com/","status",{"text":516,"config":517},"Terms of use",{"href":518,"dataGaName":519,"dataGaLocation":466},"/terms/","terms of use",{"text":521,"config":522},"Privacy statement",{"href":523,"dataGaName":524,"dataGaLocation":466},"/privacy/","privacy statement",{"text":526,"config":527},"Cookie preferences",{"dataGaName":528,"dataGaLocation":466,"id":529,"isOneTrustButton":12},"cookie preferences","ot-sdk-btn",{"title":89,"links":531,"subMenu":540},[532,536],{"text":533,"config":534},"DevSecOps platform",{"href":71,"dataGaName":535,"dataGaLocation":466},"devsecops platform",{"text":537,"config":538},"AI-Assisted Development",{"href":78,"dataGaName":539,"dataGaLocation":466},"ai-assisted 
development",[541],{"title":542,"links":543},"Topics",[544,549,554,559,564,569,574,579],{"text":545,"config":546},"CICD",{"href":547,"dataGaName":548,"dataGaLocation":466},"/topics/ci-cd/","cicd",{"text":550,"config":551},"GitOps",{"href":552,"dataGaName":553,"dataGaLocation":466},"/topics/gitops/","gitops",{"text":555,"config":556},"DevOps",{"href":557,"dataGaName":558,"dataGaLocation":466},"/topics/devops/","devops",{"text":560,"config":561},"Version Control",{"href":562,"dataGaName":563,"dataGaLocation":466},"/topics/version-control/","version control",{"text":565,"config":566},"DevSecOps",{"href":567,"dataGaName":568,"dataGaLocation":466},"/topics/devsecops/","devsecops",{"text":570,"config":571},"Cloud Native",{"href":572,"dataGaName":573,"dataGaLocation":466},"/topics/cloud-native/","cloud native",{"text":575,"config":576},"AI for Coding",{"href":577,"dataGaName":578,"dataGaLocation":466},"/topics/devops/ai-for-coding/","ai for coding",{"text":580,"config":581},"Agentic AI",{"href":582,"dataGaName":583,"dataGaLocation":466},"/topics/agentic-ai/","agentic ai",{"title":585,"links":586},"Solutions",[587,589,591,596,600,603,607,610,612,615,618,623],{"text":131,"config":588},{"href":126,"dataGaName":131,"dataGaLocation":466},{"text":120,"config":590},{"href":103,"dataGaName":104,"dataGaLocation":466},{"text":592,"config":593},"Agile development",{"href":594,"dataGaName":595,"dataGaLocation":466},"/solutions/agile-delivery/","agile delivery",{"text":597,"config":598},"SCM",{"href":116,"dataGaName":599,"dataGaLocation":466},"source code management",{"text":545,"config":601},{"href":109,"dataGaName":602,"dataGaLocation":466},"continuous integration & delivery",{"text":604,"config":605},"Value stream management",{"href":159,"dataGaName":606,"dataGaLocation":466},"value stream 
management",{"text":550,"config":608},{"href":609,"dataGaName":553,"dataGaLocation":466},"/solutions/gitops/",{"text":169,"config":611},{"href":171,"dataGaName":172,"dataGaLocation":466},{"text":613,"config":614},"Small business",{"href":176,"dataGaName":177,"dataGaLocation":466},{"text":616,"config":617},"Public sector",{"href":181,"dataGaName":182,"dataGaLocation":466},{"text":619,"config":620},"Education",{"href":621,"dataGaName":622,"dataGaLocation":466},"/solutions/education/","education",{"text":624,"config":625},"Financial services",{"href":626,"dataGaName":627,"dataGaLocation":466},"/solutions/finance/","financial services",{"title":189,"links":629},[630,632,634,636,639,641,643,645,647,649,651,653],{"text":201,"config":631},{"href":203,"dataGaName":204,"dataGaLocation":466},{"text":206,"config":633},{"href":208,"dataGaName":209,"dataGaLocation":466},{"text":211,"config":635},{"href":213,"dataGaName":214,"dataGaLocation":466},{"text":216,"config":637},{"href":218,"dataGaName":638,"dataGaLocation":466},"docs",{"text":239,"config":640},{"href":241,"dataGaName":242,"dataGaLocation":466},{"text":234,"config":642},{"href":236,"dataGaName":237,"dataGaLocation":466},{"text":244,"config":644},{"href":246,"dataGaName":247,"dataGaLocation":466},{"text":252,"config":646},{"href":254,"dataGaName":255,"dataGaLocation":466},{"text":257,"config":648},{"href":259,"dataGaName":260,"dataGaLocation":466},{"text":262,"config":650},{"href":264,"dataGaName":265,"dataGaLocation":466},{"text":267,"config":652},{"href":269,"dataGaName":270,"dataGaLocation":466},{"text":272,"config":654},{"href":274,"dataGaName":275,"dataGaLocation":466},{"title":290,"links":656},[657,659,661,663,665,667,669,673,678,680,682,684],{"text":297,"config":658},{"href":299,"dataGaName":292,"dataGaLocation":466},{"text":302,"config":660},{"href":304,"dataGaName":305,"dataGaLocation":466},{"text":310,"config":662},{"href":312,"dataGaName":313,"dataGaLocation":466},{"text":315,"config":664},{"href":317,"dataGaN
ame":318,"dataGaLocation":466},{"text":320,"config":666},{"href":322,"dataGaName":323,"dataGaLocation":466},{"text":325,"config":668},{"href":327,"dataGaName":328,"dataGaLocation":466},{"text":670,"config":671},"Sustainability",{"href":672,"dataGaName":670,"dataGaLocation":466},"/sustainability/",{"text":674,"config":675},"Diversity, inclusion and belonging (DIB)",{"href":676,"dataGaName":677,"dataGaLocation":466},"/diversity-inclusion-belonging/","Diversity, inclusion and belonging",{"text":330,"config":679},{"href":332,"dataGaName":333,"dataGaLocation":466},{"text":340,"config":681},{"href":342,"dataGaName":343,"dataGaLocation":466},{"text":345,"config":683},{"href":347,"dataGaName":348,"dataGaLocation":466},{"text":685,"config":686},"Modern Slavery Transparency Statement",{"href":687,"dataGaName":688,"dataGaLocation":466},"https://handbook.gitlab.com/handbook/legal/modern-slavery-act-transparency-statement/","modern slavery transparency statement",{"items":690},[691,694,697],{"text":692,"config":693},"Terms",{"href":518,"dataGaName":519,"dataGaLocation":466},{"text":695,"config":696},"Cookies",{"dataGaName":528,"dataGaLocation":466,"id":529,"isOneTrustButton":12},{"text":698,"config":699},"Privacy",{"href":523,"dataGaName":524,"dataGaLocation":466},[701],{"id":702,"title":18,"body":8,"config":703,"content":705,"description":8,"extension":27,"meta":709,"navigation":12,"path":710,"seo":711,"stem":712,"__hash__":713},"blogAuthors/en-us/blog/authors/patrick-steinhardt.yml",{"template":704},"BlogAuthor",{"name":18,"config":706},{"headshot":707,"ctfId":708},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1749661952/Blog/Author%20Headshots/pks-gitlab-headshot.png","pksgitlab",{},"/en-us/blog/authors/patrick-steinhardt",{},"en-us/blog/authors/patrick-steinhardt","SV9Yd_vW69UbvntDP-SEOV9NKT_VwUAj5nfftf2ElSw",[715,726,738],{"content":716,"config":724},{"title":717,"description":718,"authors":719,"heroImage":720,"date":721,"category":9,"tags":722,"body":723},"Wha
t’s new in Git 2.54.0?","Learn about release contributions, including new repository maintenance, a new command to edit commit history, a replacement for git-sizer(1), and more.",[18],"https://res.cloudinary.com/about-gitlab-com/image/upload/v1776711651/sj7xxyyuimlarswbyft5.png","2026-04-20",[25,23,260],"The Git project recently released [Git 2.54.0](https://lore.kernel.org/git/xmqqa4uxsjrs.fsf@gitster.g/T/#u). Let's look at a few notable highlights from this release, which includes contributions from the Git team at GitLab.\n\n## Pluggable Object Databases\n\nGit already has the ability to store references with either the \"files\" backend or with the [\"reftable\" backend](https://about.gitlab.com/blog/a-beginners-guide-to-the-git-reftable-format/). This is achieved by having proper abstractions in Git that allow us to have different backends.\n\nBut references are just one of the two important types of data that are stored in repositories, with the other being objects. Objects are stored in the object database, and each object database in turn consists of multiple object sources where objects can be read from or written to. Each object source either stores individual objects as so-called \"loose\" objects, or compresses multiple objects into a \"packfile\" in your `.git/objects` directory.\n\nUntil now, however, these sources did not have a proper abstraction boundary, so the storage format for objects is completely hardcoded into Git. But this is finally changing with pluggable object databases! The concept is straightforward and similar to how we did this for references in the past: Instead of having hardcoded code paths for how to store objects, we introduce an abstraction boundary that allows us to have different backends for storing objects.\n\nWhile the idea is simple, the implementation is not, as we have hardcoded assumptions about the storage formats used in Git all over the place. 
In fact, we have started working on this topic in Git 2.48, which was released in January 2025. Initially, we focused on making object-related subsystems self-contained and creating proper subsystems for the existing backends that we had in Git.\n\nWith Git 2.54, we have now reached a milestone: The object database backend is now pluggable. Not all of Git's functionality is covered yet, but introducing an alternate backend that handles a meaningful subset of operations is now a realistic undertaking.\n\nFor now, only local workflows like creating commits, showing commit graphs, or performing merges will work with such an alternative implementation. This notably excludes anything that interacts with a remote, such as when you want to fetch or push changes. Regardless, this is the culmination of almost two years of work spanning across almost 400 commits that have been merged upstream, and we will of course continue to iterate on this effort.\n\nSo why does this matter? The idea is that it becomes practical to introduce new storage formats into Git. Examples could be:\n- A storage format that is able to store large binary files more efficiently\n  than packfiles do today\n\n- A storage format that is custom-tailored for GitLab to ensure that we can\n  serve repositories to our users even more efficiently than we currently can\n\n\nThis is a large-scale effort that is likely to shape the future of Git and GitLab.\n\n*This project was led by [Patrick Steinhardt](https://gitlab.com/pks-gitlab).*\n\n## Easier editing of your commit history\n\nIn many software development projects it is common practice for developers to not only polish the code they want to contribute, but to also polish the commit history so that it becomes easy to review. 
The result is a set of small and atomic commits that each do one thing, with a good commit message that describes the intent of the commit as well as specific nuances.\n\nOf course, more often than not, these atomic commits are not something that just happens naturally during the development process. Instead, the author of the changes will gain a better understanding of what they are while iterating on them, and the way to split up the commits will become clearer over time. Furthermore, the subsequent review process may result in feedback that requires changes to the crafted commits.\n\nThe consequence of this process is that the developer will have to rewrite their commit history many times during the development process. Historically, Git has allowed for this use case via [interactive rebases](https://git-scm.com/docs/git-rebase#_interactive_mode). These interactive rebases are an extremely powerful tool: They let you reorder commits, rewrite commit messages, squash multiple commits together, or perform arbitrary edits of any commit.\n\nBut they are also somewhat arcane and hard to understand. The user needs to figure out the base commit for the rebase, they need to understand how to edit a somewhat obscure \"instruction sheet,\" and they need to be aware of how the stateful rebasing process works. For example, users are presented with an instruction sheet similar to the following when rebasing a topic branch:\n\n```shell\npick b60623f382 # t: detect errors outside of test cases # empty\npick b80cb55882 # t: prepare `test_match_signal ()` calls for `set -e`\npick 5ffe397f30 # t: prepare `test_must_fail ()` for `set -e`\npick 5e9b0cf5e1 # t: prepare `stop_git_daemon ()` for `set -e`\npick 299561e7a2 # t: prepare `git config --unset` calls for `set -e`\npick ed0e7ca2b5 # t: detect errors outside of test cases\n```\n\nSo while interactive rebases are powerful, they are also quite intimidating for the average user.\n\nIt doesn't have to be this way, though. 
Tools like [Jujutsu](https://www.jj-vcs.dev/latest/) provide interfaces that are much easier to use compared to Git, as you can for example simply execute `jj split` to split up a commit into two commits. With Git and interactive rebases, this use case requires a lot of different steps with confusing command line arguments.\n\nWe have thus taken inspiration from Jujutsu and have introduced a new git-history(1) command into Git that is the foundation for better history editing. For now, this command has two subcommands:\n\n- `git history reword` allows you to easily rewrite a commit message. You simply\n  give it the commit whose message you want to reword, Git asks you for the new\n  commit message, and that's it.\n\n- `git history split` allows you to split up a commit into two, which is\n  inspired by `jj split`. You give it a commit, Git asks you which changes to\n  stage into which commit and for the two commit messages, and then you're done.\n\n\nThis is of course only a start, and we want to add additional subcommands over time. For example:\n\n- `git history fixup` to take staged changes and automatically amend them to a\n  specific commit\n\n- `git history drop` to remove a commit\n- `git history reorder` to reorder the sequence of commits\n- `git history squash` to squash a range of commits\n\nBut that's not all! In addition to making history editing easy, this new command also knows to automatically rebase all of your local branches that previously included this commit. So that means that you can even edit a commit that is not on the current branch, and all branches that contain the commit will be rewritten.\n\nIt may seem puzzling at first that Git is automatically rebasing dependent branches, as that is a significant departure from how git-rebase(1) works. 
But this is part of a bigger effort to bring better support for Stacked Diffs to Git, which are a way to create a series of multiple dependent branches that can be reviewed independently, but that together work towards a bigger goal.\n\n*This project was led by [Patrick Steinhardt](https://gitlab.com/pks-gitlab) with support from [Elijah Newren](https://github.com/newren).*\n\n## A native replacement for git-sizer(1)\n\nThe size of a Git repository is an important factor that determines how well Git and GitLab can handle it. But size alone is not the only factor, as the performance of a repository is ultimately a combination of multiple different dimensions:\n\n- The depth of the commit history\n- The shape of the directory structure\n- The size of files stored in the repository\n- The number of references\n\nThese are only some of the dimensions one needs to consider when trying to predict whether Git will be able to handle a repository well.\n\nBut while it is clear that the mere repository size is insufficient, Git itself does not provide any tooling that gives the user an easy overview of these metrics. Instead, users are forced to rely on third-party tools like [git-sizer(1)](https://github.com/github/git-sizer) to fill this gap. 
This tool does an excellent job at surfacing this information, but it is not part of Git itself and thus needs to be installed separately.\n\nObservability of repository internals is critical to us at GitLab, so we introduced a [new `git repo structure` command into Git 2.52](https://about.gitlab.com/blog/whats-new-in-git-2-52-0/#new-subcommand-for-git-repo1-to-display-repository-metrics) to display repository metrics, which we have extended in Git 2.53 to [show inflated and disk sizes for objects by type](https://about.gitlab.com/blog/whats-new-in-git-2-53-0/#more-data-collected-in-git-repo-structure).\n\nIn Git 2.54, we are now iterating some more on this command so that we don't only show the overall size, but also show the largest objects by type:\n\n```shell\n$ git clone https://gitlab.com/git-scm/git.git\n$ cd git\n$ git repo structure\nCounting objects: 410445, done.\n| Repository structure      | Value       |\n| ------------------------- | ----------- |\n| * References              |             |\n|   * Count                 |    1.01 k   |\n|     * Branches            |       1     |\n|     * Tags                |    1.00 k   |\n|     * Remotes             |       9     |\n|     * Others              |       0     |\n|                           |             |\n| * Reachable objects       |             |\n|   * Count                 |  410.45 k   |\n|     * Commits             |   83.99 k   |\n|     * Trees               |  164.46 k   |\n|     * Blobs               |  161.00 k   |\n|     * Tags                |    1.00 k   |\n|   * Inflated size         |    7.46 GiB |\n|     * Commits             |   57.53 MiB |\n|     * Trees               |    2.33 GiB |\n|     * Blobs               |    5.07 GiB |\n|     * Tags                |  737.48 KiB |\n|   * Disk size             |  181.37 MiB |\n|     * Commits             |   33.11 MiB |\n|     * Trees               |   40.58 MiB |\n|     * Blobs               |  107.11 MiB |\n|     * Tags                |  
582.67 KiB |\n|                           |             |\n| * Largest objects         |             |\n|   * Commits               |             |\n|     * Maximum size    [1] |   17.23 KiB |\n|     * Maximum parents [2] |      10     |\n|   * Trees                 |             |\n|     * Maximum size    [3] |   58.85 KiB |\n|     * Maximum entries [4] |    1.18 k   |\n|   * Blobs                 |             |\n|     * Maximum size    [5] | 1019.51 KiB |\n|   * Tags                  |             |\n\n|     * Maximum size    [6] |    7.13 KiB |\n\n[1] f6ecb603ff8af608a417d7724727d6bc3a9dbfdf\n[2] 16d7601e176cd53f3c2f02367698d06b85e08879\n[3] 203ee97047731b9fd3ad220faa607b6677861a0d\n[4] 203ee97047731b9fd3ad220faa607b6677861a0d\n[5] aa96f8bc361fd84a1459440f1e7de02ab0dc3543\n[6] 07e38db6a5a03690034d27104401f6c8ea40f1fc\n```\n\nWith this information we're now almost feature-complete as compared to git-sizer(1). We're not done yet, though — we plan to eventually add additional features such as:\n\n- Severity levels as they exist in git-sizer(1)\n- Graphs that show you the distribution of object sizes\n- The ability to scan objects reachable via a subset of references\n\n*This project was led by [Justin Tobler](https://gitlab.com/justintobler).*\n\n## New infrastructure for repository maintenance\n\nWhenever you write data into a Git repository you will typically end up adding more loose objects. Left unmanaged, this leads to a large number of separate files in your `.git/objects/` directory, which slows down several operations that want to access many objects at once. 
Git thus regularly packs these objects into \"packfiles\" to ensure good performance.\n\nThis isn't the only data structure that may become inefficient over time: Updating references may create loose references, reflogs will need trimming, worktrees may become stale, and caches like commit-graphs need to be refreshed regularly.\n\nAll of these tasks have historically been managed by [git-gc(1)](https://git-scm.com/docs/git-gc). However, this tool has a monolithic architecture, where it basically executes all of the tasks required in sequential order. This foundation is hard to extend and doesn't give the end user much flexibility in case they want to slightly modify how housekeeping is performed.\n\nThe Git project introduced the new [git-maintenance(1)](https://git-scm.com/docs/git-maintenance) tool in Git 2.29. In contrast to git-gc(1), git-maintenance(1) is not monolithic but is instead structured around tasks. These tasks are freely configurable by the user so that the user can control which tasks are running, giving them much more fine-grained control over repository maintenance.\n\nEventually, Git has migrated to use git-maintenance(1) by default. But in the beginning, the only task that was default-enabled was the git-gc(1) task, which as you might have guessed, simply executes `git gc`. 
To manually run maintenance using this new command you can execute `git maintenance run`, but Git knows to execute this automatically after several other commands.\n\nOver the last couple releases we have implemented all the individual tasks that are supported by git-gc(1) in git-maintenance(1) to ensure that we have feature parity between these two tools.\n\nFurthermore, we have implemented a new task that uses Git's modern architecture for repacking objects with [geometric compaction](https://git-scm.com/docs/git-repack#Documentation/git-repack.txt---geometricfactor).\nGeometric compaction is a much better fit for large monorepos, and with our efforts to make them work well with partial clones [that landed in Git 2.53](https://about.gitlab.com/blog/whats-new-in-git-2-53-0/#geometric-repacking-support-with-promisor-remotes) they are now a full replacement for our previous repacking strategy in Git.\n\nIn Git 2.54, we have now reached another significant milestone: Instead of using the git-gc(1)-based strategy by default, we are now using geometric repacking with fine-grained individual maintenance tasks! Besides being more efficient for large monorepos, it also ensures that we have an easier foundation to iterate on going forward.\n\n*The git-maintenance(1) infrastructure was originally implemented by [Derrick Stolee](https://github.com/derrickstolee) and geometric maintenance was introduced by [Taylor Blau](https://github.com/ttaylorr). The effort to introduce the new fine-grained tasks and migrate to the new maintenance strategy was led by [Patrick Steinhardt](https://gitlab.com/pks-gitlab).*\n\n## Read more\n\nThis article highlighted just a few of the contributions made by GitLab and the wider Git community for this latest release. You can learn about these from the [official release announcement](https://lore.kernel.org/git/xmqqa4uxsjrs.fsf@gitster.g/T/#u) of the Git project. 
Also, check out our [previous Git release blog posts](https://about.gitlab.com/blog/tags/git/) to see other past highlights of contributions from GitLab team members.",{"slug":725,"featured":31,"template":13},"whats-new-in-git-2-54-0",{"content":727,"config":736},{"title":728,"description":729,"authors":730,"date":732,"body":733,"heroImage":734,"category":9,"tags":735},"What’s new in Git 2.53.0?","Learn about release contributions, including fixes for geometric repacking, updates to git-fast-import(1) commit signature handling options, and more.",[731],"Justin Tobler","2026-02-02","The Git project recently released [Git 2.53.0](https://lore.kernel.org/git/xmqq4inz13e3.fsf@gitster.g/T/#u). Let's look at a few notable highlights from this release, which includes\ncontributions from the Git team at GitLab.\n\n## Geometric repacking support with promisor remotes\n\nNewly written objects in a Git repository are often stored as individual loose files. To ensure good performance and optimal use of disk space, these loose objects are regularly compressed into so-called packfiles. The number of packfiles in a repository grows over time as a result of the user’s activities, like writing new commits or fetching from a remote. As the number of packfiles in a repository increases, Git has to do more work to look up individual objects. Therefore, to preserve optimal repository performance, packfiles are periodically repacked via git-repack(1) to consolidate the objects into fewer packfiles. When repacking there are two strategies: “all-into-one” and “geometric”.\n\nThe all-into-one strategy is fairly straightforward and the current default. As its name implies, all objects in the repository are packed into a single packfile. From a performance perspective this is great for the repository as Git only has to scan through a single packfile when looking up objects. 
The main downside of such a repacking strategy is that computing a single packfile for a repository can take a significant amount of time for large repositories.\n\nThe geometric strategy helps mitigate this concern by maintaining a geometric progression of packfiles based on their size instead of always repacking into a single packfile. To explain more plainly, when repacking Git maintains a set of packfiles ordered by size where each packfile in the sequence is expected to be at least twice the size of the preceding packfile. If a packfile in the sequence violates this property, packfiles are combined as needed until the progression is restored. This strategy has the advantage of still minimizing the number of packfiles in a repository while also minimizing the amount of work that must be done for most repacking operations.\n\nOne problem with the geometric repacking strategy was that it was not compatible with partial clones. Partial clones allow the user to clone only parts of a repository by, for example, skipping all blobs larger than 1 megabyte. This can significantly reduce the size of a repository, and Git knows how to backfill missing objects that it needs to access at a later point in time.\n\nThe result is a repository that is missing some objects, and any object that may not be fully connected is stored in a “promisor” packfile.  When repacking, this promisor property needs to be retained going forward for packfiles containing a promisor object so it is known whether a missing object is expected and can be backfilled from the promisor remote. With an all-into-one repack, Git knows how to handle promisor objects properly and stores them in a separate promisor packfile. Unfortunately, the geometric repacking strategy did not know to give special treatment to promisor packfiles and instead would merge them with normal packfiles without considering whether they reference promisor objects. 
Luckily, due to a bug, the underlying git-pack-objects(1) died when using geometric repacking in a partial clone repository. This meant repositories in this configuration could not be repacked at all, which isn’t great, but is better than repository corruption.\n\nWith the release of Git 2.53, geometric repacking now works with partial clone repositories. When performing a geometric repack, promisor packfiles are handled separately in order to preserve the promisor marker and repacked following a separate geometric progression. With this fix, the geometric strategy moves closer towards becoming the default repacking strategy. For more information check out the corresponding [mailing list thread](https://lore.kernel.org/git/20260105-pks-geometric-repack-with-promisors-v1-0-c4660573437e@pks.im/).\n\nThis project was led by [Patrick Steinhardt](https://gitlab.com/pks-gitlab).\n\n## git-fast-import(1) learned to preserve only valid signatures\n\nIn our [Git 2.52 release article](https://about.gitlab.com/blog/whats-new-in-git-2-52-0/), we covered signature-related improvements to git-fast-import(1) and git-fast-export(1). Be sure to check out that post for a more detailed explanation of these commands, how they are used, and the changes being made with regard to signatures.\n\nTo quickly recap, git-fast-import(1) provides a backend to efficiently import data into a repository and is used by tools such as [git-filter-repo(1)](https://github.com/newren/git-filter-repo) to help rewrite the history of a repository in bulk. In the Git 2.52 release, git-fast-import(1) learned the `--signed-commits=\u003Cmode>` option similar to the same option in git-fast-export(1). With this option, it became possible to unconditionally retain or strip signatures from commits/tags.\n\nIn situations where only part of the repository history has been rewritten, any signature for rewritten commits/tags becomes invalid. 
This means git-fast-import(1) is limited to either stripping all signatures or keeping all signatures even if they have become invalid. But retaining invalid signatures doesn’t make much sense, so rewriting history with git-filter-repo(1) results in all signatures being stripped, even if the underlying commit/tag is not rewritten. This is unfortunate because if the commit/tag is unchanged, its signature is still valid and thus there is no real reason to strip it. What is really needed is a means to preserve signatures for unchanged objects, but strip invalid ones.\n\nWith the release of Git 2.53, the git-fast-import(1) `--signed-commits=\u003Cmode>` option has learned a new `strip-if-invalid` mode which, when specified, only strips signatures from commits that become invalid due to being rewritten. Thus, with this option it becomes possible to preserve some commit signatures when using git-fast-import(1). This is a critical step towards providing the foundation for tools like git-filter-repo(1) to preserve valid signatures and eventually re-sign invalid signatures.\n\nThis project was led by [Christian Couder](https://gitlab.com/chriscool).\n\n## More data collected in git-repo-structure\n\nIn the Git 2.52 release, the “structure” subcommand was introduced to git-repo(1). The intent of this command was to collect information about the repository and eventually become a native replacement for tools such as [git-sizer(1)](https://github.com/github/git-sizer). At GitLab, we host some extremely large repositories, and having insight into the general structure of a repository is critical to understanding its performance characteristics. In this release, the command now also collects total size information for reachable objects in a repository to help understand the overall size of the repository. 
In the output below, you can see the command now collects both the total inflated and disk sizes of reachable objects by object type.\n\n```shell\n$ git repo structure\n\n| Repository structure | Value      |\n| -------------------- | ---------- |\n| * References         |            |\n|   * Count            |   1.78 k   |\n|     * Branches       |      5     |\n|     * Tags           |   1.03 k   |\n|     * Remotes        |    749     |\n|     * Others         |      0     |\n|                      |            |\n| * Reachable objects  |            |\n|   * Count            | 421.37 k   |\n|     * Commits        |  88.03 k   |\n|     * Trees          | 169.95 k   |\n|     * Blobs          | 162.40 k   |\n|     * Tags           |    994     |\n|   * Inflated size    |   7.61 GiB |\n|     * Commits        |  60.95 MiB |\n|     * Trees          |   2.44 GiB |\n|     * Blobs          |   5.11 GiB |\n|     * Tags           | 731.73 KiB |\n|   * Disk size        | 301.50 MiB |\n|     * Commits        |  33.57 MiB |\n|     * Trees          |  77.92 MiB |\n|     * Blobs          | 189.44 MiB |\n|     * Tags           | 578.13 KiB |\n```\n\nThe keen-eyed among you may have also noticed that the size values in the table output are also now listed in a more human-friendly manner with units appended. In subsequent releases we hope to further expand this command's output to provide additional data points such as the largest individual objects in the repository.\n\nThis project was led by [Justin Tobler](https://gitlab.com/justintobler).\n\n## Read more\n\nThis article highlighted just a few of the contributions made by GitLab and\nthe wider Git community for this latest release. You can learn about these from\nthe [official release announcement](https://lore.kernel.org/git/xmqq4inz13e3.fsf@gitster.g/T/#u) of the Git project. 
Also, check\nout our [previous Git release blog posts](https://about.gitlab.com/blog/tags/git/)\nto see other past highlights of contributions from GitLab team members.","https://res.cloudinary.com/about-gitlab-com/image/upload/v1749663087/Blog/Hero%20Images/git3-cover.png",[25,23,260],{"featured":12,"template":13,"slug":737},"whats-new-in-git-2-53-0",{"content":739,"config":748},{"title":740,"description":741,"authors":742,"heroImage":734,"date":745,"body":746,"category":9,"tags":747},"What’s new in Git 2.52.0?","Learn about release contributions, including the new git-last-modified(1) command, improvements to history-rewriting tools, and a new maintenance strategy.",[743,744,18],"Christian Couder","Toon Claes","2025-11-17","The Git project recently released [Git 2.52](https://lore.kernel.org/git/xmqqh5usmvsd.fsf@gitster.g/). After a relatively short 8-week [release cycle for 2.51](https://about.gitlab.com/blog/what-s-new-in-git-2-51-0/), due to summer in the Northern Hemisphere, this release is back to the usual 12-week cycle. 
Let’s look at some notable changes, including contributions from the GitLab Git team and the wider Git community.\n\n## New git-last-modified(1) command\n\nMany Git forges like GitLab display files in a tree view like this:\n\n\n| Name        | Last commit                                             | Last update  |\n| ------------- | --------------------------------------------------------- | -------------- |\n| README.md   | README: *.txt -> *.adoc fixes                           | 4 months ago |\n| RelNotes    | Start 2.51 cycle, the first batch                       | 4 weeks ago  |\n| SECURITY.md | SECURITY: describe how to report vulnerabilities        | 4 years      |\n| abspath.c   | abspath: move related functions to abspath              | 2 years      |\n| abspath.h   | abspath: move related functions to abspath              | 2 years      |\n| aclocal.m4  | configure: use AC_LANG_PROGRAM consistently             | 15 years ago |\n| add-patch.c | pager: stop using `the_repository`                      | 7 months ago |\n| advice.c    | advice: allow disabling default branch name advice      | 4 months ago |\n| advice.h    | advice: allow disabling default branch name advice      | 4 months ago |\n| alias.h     | rebase -m: fix serialization of strategy options        | 2 years      |\n| alloc.h     | git-compat-util: move alloc macros to git-compat-util.h | 2 years ago  |\n| apply.c     | apply: only write intents to add for new files          | 8 days ago   |\n| archive.c   | Merge branch 'ps/parse-options-integers'                | 3 months ago |\n| archive.h   | archive.h: remove unnecessary include                   | 1 year       |\n| attr.h      | fuzz: port fuzz-parse-attr-line from OSS-Fuzz           | 9 months ago |\n| banned.h    | banned.h: mark `strtok()` and `strtok_r()` as banned    | 2 years      |\n\n\n\u003Cbr>\u003C/br>\n\nNext to the files themselves, we also display which commit last modified each respective file. 
This information is easy to extract from Git by executing the following command:\n\n\n```shell\n\n$ git log --max-count=1 HEAD -- \u003Cfilename>\n\n```\n\nWhile nice and simple, this has a significant catch: Git does not have a way to extract this information for each of these files in a single command. So to get the last commit for all the files in the tree, we'd need to run this command for each file separately. This results in a command pipeline similar to the following:\n\n```shell\n\n$ git ls-tree HEAD --name-only | xargs --max-args=1 git log --max-count=1 HEAD --\n\n```\n\nNaturally, this isn't very efficient:\n\n\n* We need to spin up a fresh Git command for each file.\n\n\n* Git has to step through history for each file separately.\n\n\n\nAs a consequence, this whole operation is quite costly and generates significant load for GitLab.\n\n\n\nTo overcome these issues, a new Git subcommand `git-last-modified(1)` has been introduced. This command returns the commit for each file of a given commit:\n\n\n\n```shell\n\n$ git last-modified HEAD\n\n\ne56f6dcd7b4c90192018e848d0810f091d092913        add-patch.c\n373ad8917beb99dc643b6e7f5c117a294384a57e        advice.h\ne9330ae4b820147c98e723399e9438c8bee60a80        advice.c\n5e2feb5ca692c5c4d39b11e1ffa056911dd7dfd3        alloc.h\n954d33a9757fcfab723a824116902f1eb16e05f7        RelNotes\n4ce0caa7cc27d50ee1bedf1dff03f13be4c54c1f        apply.c\n5d215a7b3eb0a9a69c0cb9aa43dcae956a0aa03e        archive.c\nc50fbb2dd225e7e82abba4380423ae105089f4d7        README.md\n72686d4e5e9a7236b9716368d86fae5bf1ae6156        attr.h\nc2c4138c07ca4d5ffc41ace0bfda0f189d3e262e        archive.h\n5d1344b4973c8ea4904005f3bb51a47334ebb370        abspath.c\n5d1344b4973c8ea4904005f3bb51a47334ebb370        abspath.h\n60ff56f50372c1498718938ef504e744fe011ffb        banned.h\n4960e5c7bdd399e791353bc6c551f09298746f61        alias.h\n2e99b1e383d2da56c81d7ab7dd849e9dab5b7bf0        SECURITY.md\n1e58dba142c673c59fbb9d10aeecf62217d4fc9c        
aclocal.m4\n```\n\n\n\nThe benefit of this is obviously that we only have to execute a single Git process now to derive all of that information. But even more importantly, it only requires us to walk the history once for all files together instead of having to walk it multiple times. This is achieved by:\n\n\n1. Start walking the history from the specified commit.\n\n\n2. For each commit:\n   1. If it doesn't modify any of the paths we're interested in we continue to the next commit.\n   2. If it does, we print the commit ID together with the path. Furthermore, we remove the path from the set of interesting paths.\n3. When the list of interesting paths becomes empty we stop.\n\n\n\nGitaly has already been adjusted to use the new command, but the logic is still guarded by a feature flag. Preliminary testing has shown that `git-last-modified(1)` is in most situations at least twice as fast compared to using `git log --max-count=1`.\n\n\n\n*These changes were [originally written](https://github.com/ttaylorr/git/tree/tb/blame-tree) by multiple developers from GitHub and were [upstreamed](https://lore.kernel.org/git/20250805093358.1791633-1-toon@iotcl.com/) into Git by [Toon Claes](https://gitlab.com/toon).*\n\n\n\n## git-fast-export(1) and git-fast-import(1) signature-related improvements\n\n\n\nThe `git-fast-export(1)` and `git-fast-import(1)` commands are designed to be mostly used by interoperability or history rewriting tools. The goal of interoperability tools is to make Git interact nicely with other software, usually a different version control system, that stores data in a different format than Git. For example [hg-fast-export.sh](https://github.com/frej/fast-export) is a “Mercurial to Git converter using git-fast-import.\"\n\n\n\nAlternately, history-rewriting tools let users — usually admins — make changes to the history of their repositories that are not possible or not allowed by the version control system. 
For example, [reposurgeon](http://www.catb.org/esr/reposurgeon/) says in its [introduction](https://gitlab.com/esr/reposurgeon/-/blob/master/repository-editing.adoc?ref_type=heads#introduction) that its purpose is “to enable risky operations that version-control systems don't want to let you do, such as (a) editing past comments and metadata, (b) excising commits, (c) coalescing and splitting commits, (d) removing files and subtrees from repo history, (e) merging or grafting two or more repos, and (f) cutting a repo in two by cutting a parent-child link, preserving the branch structure of both child repos.\"\n\n\n\nWithin GitLab, we use [git-filter-repo](https://github.com/newren/git-filter-repo) to let admins perform some risky operations on their Git repositories. Unfortunately, until Git 2.50 (released last June), both `git-fast-export(1)` and `git-fast-import(1)` didn't handle cryptographic commit signatures at all. So, although `git-fast-export(1)` had a `--signed-tags=\u003Cmode>` option that allows users to change how cryptographic tag signatures are handled, commit signatures were simply ignored.\n\n\n\nCryptographic signatures are very fragile because they are based on the exact commit or tag data that was signed. When the signed data or any of its preceding history changes, the cryptographic signature becomes invalid. This is a fragile but necessary requirement to make these signatures useful.\n\n\n\nBut in the context of rewriting history this is a problem:\n\n\n\n* We may want to keep cryptographic signatures for both commits and tags that are still valid after the rewrite (e.g. 
because the history leading up to them did not change).\n\n\n* We may want to create new cryptographic signatures for commits and tags where the previous signature has become invalid.\n\n\n\nNeither `git-fast-import(1)` nor `git-fast-export(1)` allow for these use cases though, which limits what tools like [git-filter-repo](https://github.com/newren/git-filter-repo) or [reposurgeon](http://www.catb.org/esr/reposurgeon/) can achieve.\n\n\n\nWe have made some significant progress:\n\n\n\n* In Git 2.50 we added a `--signed-commits=\u003Cmode>` option to `git-fast-export(1)` for exporting commit signatures, and support in `git-fast-import(1)` for importing them.\n\n\n* In Git 2.51 we improved the format used for exporting and importing commit signatures, and we made it possible for `git-fast-import(1)` to import both a signature made on the SHA-1 object ID of the commit and one made on its SHA-256 object ID.\n\n\n* In Git 2.52 we added the `--signed-commits=\u003Cmode>` and `--signed-tags=\u003Cmode>` options to `git-fast-import(1)`, so the user has control over how to handle signed data at import time.\n\n\n\nThere is still more to be done. We need to add the ability to:\n\n\n\n* Retain only those commit signatures that are still valid to `git-fast-import(1)`.\n\n\n* Re-sign data where the signature became invalid.\n\n\n\nWe have already started to work on these next steps and expect this to land in Git 2.53. Once done, tools like `git-filter-repo(1)` will finally start to handle cryptographic signatures more gracefully. We will keep you posted in our next Git release blog post.\n\n\n\n*This project was led by [Christian Couder](https://gitlab.com/chriscool).*\n\n\n\n## New and improved git-maintenance(1) strategies\n\n\n\nGit repositories require regular maintenance to ensure that they perform well. 
This maintenance performs a bunch of different tasks: references get optimized, objects get compressed, and stale data gets pruned.\n\n\n\nUntil Git 2.28, these maintenance tasks were performed by `git-gc(1)`. The problem with this command is that it wasn't built with customizability in mind: While certain parameters can be configured, it is not possible to control which parts of a repository should be optimized. This means that it may not be a good fit for all use cases. Even more importantly, it made it very hard to iterate on how exactly Git performs repository maintenance.\n\n\n\nTo fix this issue and allow us to iterate again, [Derrick Stolee](https://github.com/derrickstolee) introduced `git-maintenance(1)`. In contrast to `git-gc(1)`, it is built with customizability in mind and allows the user to configure which tasks specifically should be running in a certain repository. This new tool was made the default for Git’s automated maintenance in Git 2.29, but, by default, it still uses `git-gc(1)` to perform the maintenance.\n\n\n\nWhile this default maintenance strategy works well in small or even medium-sized repositories, it is problematic in the context of large monorepos. The biggest limiting factor is how `git-gc(1)` repacks objects: Whenever there are more than 50 packfiles, the tool will merge all of them together into a single packfile. This operation is quite CPU-intensive and causes a lot of I/O operations, so for large monorepos this operation can easily take many minutes or even hours to complete.\n\n\n\nGit already knows how to minimize these repacks via “geometric repacking.” The idea is simple: The packfiles that exist in the repository must follow a geometric progression where every packfile must contain at least twice as many objects as the next smaller one. This allows Git to amortize the number of repacks required while still ensuring that there is only a relatively small number of packfiles overall. 
This mode was introduced by [Taylor Blau](https://github.com/ttaylorr) in Git 2.32, but it was not wired up as part of the automated maintenance.\n\n\n\nAll the parts exist to make repository maintenance way more scalable for large monorepos: We have the flexible `git-maintenance(1)` tool that can be extended to have a new maintenance strategy, and we have a better way to repack objects. All that needs to be done is to combine these two.\n\n\n\nAnd that's exactly what we did with Git 2.52! We have introduced a new “geometric” maintenance strategy that you can configure in your Git repositories. This strategy is intended as a full replacement for the old strategy based on `git-gc(1)`. Here is the configuration you need:\n\n\n\n```shell\n\n$ git config set maintenance.strategy geometric\n\n```\n\n\n\nFrom here on, Git will use geometric repacking to optimize your objects. This should lead to less churn while ensuring that your objects are in a better-optimized state, especially in large monorepos.\n\n\n\nIn Git 2.53, we aim to make this the default strategy. So stay tuned!\n\n\n\n*This project was led by [Patrick Steinhardt](https://gitlab.com/pks-gitlab).*\n\n\n\n## New subcommand for git-repo(1) to display repository metrics\n\n\n\nThe performance of Git operations in a repository often depends on certain characteristics of its underlying structure. At GitLab, we host some extremely large repositories, and having insight into the general structure of a repository is critical to understanding performance. While it is possible to compose various Git commands and other tools together to surface certain repository metrics, Git lacks a means to surface info about a repository's shape/structure via a single command. 
This has led to the development of other external tools, such as [git-sizer(1)](https://github.com/github/git-sizer), to fill this gap.\n\n\n\nWith the release of Git 2.52, a new “structure” subcommand has been added to git-repo(1) with the aim to surface information about a repository's structure. Currently, it displays info about the number of references and objects in the repository in the following form:\n\n\n\n```shell\n\n$ git repo structure\n\n\n| Repository structure | Value  |\n| -------------------- | ------ |\n| * References         |        |\n|   * Count            |   1772 |\n|     * Branches       |      3 |\n|     * Tags           |   1025 |\n|     * Remotes        |    744 |\n|     * Others         |      0 |\n|                      |        |\n| * Reachable objects  |        |\n|   * Count            | 418958 |\n|     * Commits        |  87468 |\n|     * Trees          | 168866 |\n|     * Blobs          | 161632 |\n|     * Tags           |    992 |\n\n```\n\n\n\nIn subsequent releases we hope to expand on this and provide other interesting data points like the largest objects in the repository.\n\n\n\n*This project was led by [Justin Tobler](https://gitlab.com/justintobler).*\n\n\n\n## Improvements related to the Google Summer of Code 2025\n\n\n\nWe had three successful projects with the Google Summer of Code.\n\n\n\n### Refactoring in order to reduce Git's global state\n\n\n\nGit contains several global variables used throughout the codebase. This increases the complexity of the code and reduces the maintainability. 
As part of this project, [Ayush Chandekar](https://ayu-ch.github.io/) worked on reducing the usage of the `the_repository` global variable via a series of patches.\n\n\n\n*The project was mentored by [Christian Couder](https://gitlab.com/chriscool) and [Ghanshyam Thakkar](https://in.linkedin.com/in/ghanshyam-thakkar).*\n\n\n\n### Machine-readable Repository Information Query Tool\n\n\n\nGit lacks a centralized way to retrieve repository information, requiring users to piece it together from various commands. While `git-rev-parse(1)` has become the de facto tool for accessing much of this information, doing so falls outside its primary purpose.\n\n\n\nAs part of this project, [Lucas Oshiro](https://lucasoshiro.github.io/en/) introduced a new command, `git-repo(1)`, which will house all repository-level information. Users can now use `git repo info` to obtain repository information:\n\n\n\n```shell\n\n$ git repo info layout.bare layout.shallow object.format references.format\n\nlayout.bare=false\nlayout.shallow=false\nobject.format=sha1\nreferences.format=reftable\n\n```\n\n\n\n*The project was mentored by [Patrick Steinhardt](https://gitlab.com/pks-gitlab) and [Karthik Nayak](https://gitlab.com/knayakgl).*\n\n\n\n### Consolidate ref-related functionality into git-refs\n\n\n\nGit offers multiple commands for managing references, namely `git-for-each-ref(1)`, `git-show-ref(1)`, `git-update-ref(1)`, and `git-pack-refs(1)`. This makes them harder to discover and creates overlapping functionality. To address this, we introduced the `git-refs(1)` command to consolidate these operations under a single interface. 
As part of this project, [Meet Soni](https://inosmeet.github.io/) extended the command by adding the following subcommands:\n\n\n\n* `git refs optimize` to optimize the reference backend\n\n\n* `git refs list` to list all references\n\n\n* `git refs exists` to verify the existence of a reference\n\n\n\n*The project was mentored by [Patrick Steinhardt](https://gitlab.com/pks-gitlab) and [shejialuo](https://luolibrary.com/).*\n\n\n\n## What's next?\n\n\n\nReady to experience these improvements? Update to Git 2.52.0 and start using `git last-modified`.\n\n\n\nAt GitLab, we will of course ensure that all of these improvements will eventually land in a GitLab instance near you!\n\n\n\nLearn more in the [official Git 2.52.0 release notes](https://lore.kernel.org/git/xmqqh5usmvsd.fsf@gitster.g/) and explore our [complete archive of Git development coverage](https://about.gitlab.com/blog/tags/git/).\n",[25,23,260],{"featured":12,"template":13,"slug":749},"whats-new-in-git-2-52-0",{"promotions":751},[752,766,778,790],{"id":753,"categories":754,"header":756,"text":757,"button":758,"image":763},"ai-modernization",[755],"ai-ml","Is AI achieving its promise at scale?","Quiz will take 5 minutes or less",{"text":759,"config":760},"Get your AI maturity score",{"href":761,"dataGaName":762,"dataGaLocation":242},"/assessments/ai-modernization-assessment/","modernization assessment",{"config":764},{"src":765},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1772138786/qix0m7kwnd8x2fh1zq49.png",{"id":767,"categories":768,"header":770,"text":757,"button":771,"image":775},"devops-modernization",[769,568],"product","Are you just managing tools or shipping innovation?",{"text":772,"config":773},"Get your DevOps maturity 
score",{"href":774,"dataGaName":762,"dataGaLocation":242},"/assessments/devops-modernization-assessment/",{"config":776},{"src":777},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1772138785/eg818fmakweyuznttgid.png",{"id":779,"categories":780,"header":782,"text":757,"button":783,"image":787},"security-modernization",[781],"security","Are you trading speed for security?",{"text":784,"config":785},"Get your security maturity score",{"href":786,"dataGaName":762,"dataGaLocation":242},"/assessments/security-modernization-assessment/",{"config":788},{"src":789},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1772138786/p4pbqd9nnjejg5ds6mdk.png",{"id":791,"paths":792,"header":795,"text":796,"button":797,"image":802},"github-azure-migration",[793,794],"migration-from-azure-devops-to-gitlab","integrating-azure-devops-scm-and-gitlab","Is your team ready for GitHub's Azure move?","GitHub is already rebuilding around Azure. Find out what it means for you.",{"text":798,"config":799},"See how GitLab compares to GitHub",{"href":800,"dataGaName":801,"dataGaLocation":242},"/compare/gitlab-vs-github/github-azure-migration/","github azure migration",{"config":803},{"src":777},{"header":805,"blurb":806,"button":807,"secondaryButton":812},"Start building faster today","See what your team can do with the intelligent orchestration platform for DevSecOps.\n",{"text":808,"config":809},"Get your free trial",{"href":810,"dataGaName":49,"dataGaLocation":811},"https://gitlab.com/-/trial_registrations/new?glm_content=default-saas-trial&glm_source=about.gitlab.com/","feature",{"text":504,"config":813},{"href":53,"dataGaName":54,"dataGaLocation":811},1777302583752]