[{"data":1,"prerenderedAt":815},["ShallowReactive",2],{"/en-us/blog/moving-all-your-data":3,"navigation-en-us":34,"banner-en-us":445,"footer-en-us":455,"blog-post-authors-en-us-Jacob Vosmaer":697,"blog-related-posts-en-us-moving-all-your-data":711,"blog-promotions-en-us":752,"next-steps-en-us":805},{"id":4,"title":5,"authorSlugs":6,"authors":8,"body":10,"category":11,"categorySlug":11,"config":12,"content":16,"date":20,"description":17,"extension":21,"externalUrl":22,"featured":14,"heroImage":19,"isFeatured":14,"meta":23,"navigation":24,"path":25,"publishedDate":20,"rawbody":26,"seo":27,"slug":13,"stem":31,"tagSlugs":32,"tags":22,"template":15,"updatedDate":22,"__hash__":33},"blogPosts/en-us/blog/moving-all-your-data.yml","Moving all your data, 9TB edition",[7],"jacob-vosmaer",[9],"Jacob Vosmaer","At GitLab B.V. we are working on an infrastructure upgrade to give more CPU\npower and storage space to GitLab.com. (We are currently still running on a\n[single server](/blog/the-hardware-that-powers-100k-git-repos/).) As a\npart of this upgrade we wanted to move gitlab.com from our own dedicated\nhardware servers to an AWS data center 400 kilometers away.  In this blog post\nI will tell you how I did that and what challenges I had to overcome. An epic\nadventure of hand-rolled network tunnels, advanced DRBD features and streaming\n9TB of data through SSH pipes!\n\n\u003C!-- more -->\n\n## What did I have to move?\n\nIn our current setup we run a stock GitLab Enterprise Edition omnibus package,\nwith a single big filesystem mounted at `/var/opt/gitlab`. This\nfilesystem holds all the user data hosted on gitlab.com: Postgres and Redis\ndatabase files, user uploads, and a lot of Git repositories. All I had to do\nto move this data to AWS is to move the files on this filesystem. Sounds simple\nenough, does it not?\n\nSo do we move the files, or the filesystem itself? This is an easy question to\nanswer. 
Moving the files using something like Rsync is not an option because it\nis just too slow. We do file-based backups every week where we take a block\ndevice snapshot, mount the snapshot and send it across with Rsync. That\ncurrently takes over 24 hours, and 24 hours of downtime while we move\ngitlab.com is not a nice idea. Now you might ask: what if you Rsync once to\nprepare, take the server offline, and then do a quick Rsync just to catch up?\nThat would still take hours just for Rsync to walk through all the files and\ndirectories on disk. No good.\n\nWe have faced and solved this same problem in the past when the amount of data\nwas 5 times smaller. (Rsync was not an option even then.) What I did at that\ntime was to use DRBD to move not just the files themselves, but the whole\nfilesystem they sit on. This time around DRBD again seemed like the best\nsolution for us. It is not the fastest solution to move a lot of data, but what\nis great about it is that you can keep using the filesystem while the data is\nbeing moved, and changes will get synchronized continuously. No downtime for\nour users! (Except maybe 5 minutes at the start to set up the sync.)\n\n## What is DRBD?\n\n[DRBD](http://www.drbd.org) is a system that can create a virtual hard drive\n(block device) on a Linux computer that gets mirrored across a network\nconnection to a second Linux computer. Both computers give a 'real' hard drive\nto DRBD, and DRBD keeps the contents of the real hard drive the same across\nboth computers via the network. One of the two computers gets a virtual hard\ndrive from DRBD, which shows the contents of the real hard drive underneath. If\nyour first computer crashes, you can 'plug in' the virtual hard drive on the\nsecond computer in a matter of seconds, and all your data will still be there\nbecause DRBD kept the 'real' hard drives in sync for you. You can even have the\ntwo computers that are linked by DRBD sit in different buildings, or on\ndifferent continents. 
Up until our move to AWS, we were using DRBD to protect\nagainst hardware failure on the server that runs gitlab.com: if such a failure\nwould happen, we could just plug in the virtual hard drive with the user data\ninto our stand-by server. In our new data center, the hosting provider (Amazon\nWeb Services) has their own solution for plugging virtual hard drives in and\nout called Elastic Block Storage, so we are no longer using DRBD as a virtual\nhard drive. From an availability standpoint this is not better or worse, but\nusing EBS drives does make it a lot easier for us to make backups because now\nwe can just store snapshots (no more Rsync).\n\n## Using DRBD for a data migration\n\nAlthough DRBD is not really made for this purpose, I felt confident using DRBD\nfor the migration because I had done it before for a migration between data\ncenters. At that time we were moving across the Atlantic Ocean; this time we\nwould only be moving from the Netherlands to Germany.  However, the last time\nwe used DRBD only as a one-off tool. In our pre-migration setup, we were\nalready using DRBD to replicate the filesystem between two servers in the same\nrack. DRBD only lets you share a virtual hard drive between two computers, so\nhow do we now send the data to a _third_ computer in the new data center?\n\nLuckily, DRBD actually has a trick up its sleeve to deal with this, called\n'stacked resources'. This means that our old servers ('linus' and 'monty')\nwould share a virtual hard drive called 'drbd0', and that whoever of the two\nhas the 'drbd0' virtual hard drive plugged in gets to use 'drbd0' as the 'real'\nhard drive underneath a second virtual hard drive, called 'drbd10', which is\nshared with the new server ('theo'). Also see the picture below.\n\n![Stacked DRBD replication](https://about.gitlab.com/images/drbd/drbd-three-nodes.png)\n\nIf linus would malfunction, we could attach drbd0 (the blue virtual hard drive)\non monty and keep gitlab.com going. 
The 'green' replication (to get the data to\ntheo) would also be able to continue, even after a failover to monty.\n\n## Networking\n\nI liked the picture above, so 'all' I had to do was set it up. That ended up\ntaking a few days, just to set up a test environment, and to figure out how to\ncreate a network tunnel for the green traffic. The network tunnel needed to\nhave a movable endpoint depending on whether linus or monty was primary. We\nalso needed the tunnel because DRBD is not compatible with the [Network Address\nTranslation](http://en.wikipedia.org/wiki/Network_address_translation) used by\nAWS. DRBD assumes that whenever a node listens on an IP address, it is also\nreachable for its partner node at that IP address. On AWS on the other hand, a\nnode will have one or more internal IP addresses, which are distinct from its\n_public_ IP address.\n\nWe chose to work around this with an [IPIP\ntunnel](http://en.wikipedia.org/wiki/IP_in_IP) and manually keyed IPsec\nencryption. Previous experiments indicated that this gave us the best network\nthroughput compared to OpenVPN and GRE tunnels.\n\nTo set up the tunnel I used a shell script that was kept in sync on all three\nservers involved in the migration by Chef.\n\n```bash\n# Network tunnel configuration script used by GitLab B.V. to migrate data from\n# Delft to Frankfurt\n\n#!/bin/sh\nset -u\n\nPATH=/usr/sbin:/sbin:/usr/bin:/bin\n\nfrankfurt_public=54.93.71.23\nfrankfurt_replication=172.16.228.2\ntest_public=54.152.127.180\ntest_replication=172.16.228.1\ndelft_public=62.204.93.103\ndelft_replication=172.16.228.1\n\ncreate_tunipip() {\n  if ! ip tunnel show | grep -q tunIPIP ; then\n    echo Creating tunnel tunIPIP\n    ip tunnel add tunIPIP mode ipip ttl 64 local \"$1\" remote \"$2\"\n  fi\n}\n\nadd_tunnel_address() {\n  if ! 
ip address show tunIPIP | grep -q \"$1\" ; then\n    ip address add \"$1/32\" peer \"$2/32\" dev tunIPIP\n  fi\n}\n\ncase $(hostname) in\n  ip-10-0-2-9)\n    create_tunipip 10.0.2.140 \"${frankfurt_public}\"\n    add_tunnel_address \"${test_replication}\" \"${frankfurt_replication}\"\n    ip link set tunIPIP up\n    ;;\n  ip-10-0-2-245)\n    create_tunipip 10.0.2.11 \"${frankfurt_public}\"\n    add_tunnel_address \"${test_replication}\" \"${frankfurt_replication}\"\n    ip link set tunIPIP up\n    ;;\n  ip-10-1-0-52|theo.gitlab.com)\n    create_tunipip 10.1.0.52 \"${delft_public}\"\n    add_tunnel_address \"${frankfurt_replication}\" \"${delft_replication}\"\n    ip link set tunIPIP up\n    ;;\n  linus|monty)\n    create_tunipip \"${delft_public}\" \"${frankfurt_public}\"\n    add_tunnel_address \"${delft_replication}\" \"${frankfurt_replication}\"\n    ip link set tunIPIP up\n    ;;\nesac\n```\n\nThis script was configured to run on boot. Note that it covers our Delft nodes\n(linus and monty, then current production), the node we were migrating to in\nFrankfurt (theo), and two AWS test nodes that were part of a staging setup. We\nchose the AWS Frankfurt (Germany) data center because of its geographic\nproximity to Delft (The Netherlands).\n\nWe configured IPsec with `/etc/ipsec-tools.conf`. 
An example for the 'origin'\nconfiguration would be:\n\n```text\n#!/usr/sbin/setkey -f\n\n# Configuration for 172.16.228.1\n\n# Flush the SAD and SPD\nflush;\nspdflush;\n\n# Attention: Use this keys only for testing purposes!\n# Generate your own keys!\n\n# AH SAs using 128 bit long keys\n# Fill in your keys below!\nadd 172.16.228.1 172.16.228.2 ah 0x200 -A hmac-md5 0xfoobar;\nadd 172.16.228.2 172.16.228.1 ah 0x300 -A hmac-md5 0xbarbaz;\n\n# ESP SAs using 192 bit long keys (168 + 24 parity)\n# Fill in your keys below!\nadd 172.16.228.1 172.16.228.2 esp 0x201 -E 3des-cbc 0xquxfoo;\nadd 172.16.228.2 172.16.228.1 esp 0x301 -E 3des-cbc 0xbazqux;\n\n# Security policies\n# outbound traffic from 172.16.228.1 to 172.16.228.2\nspdadd 172.16.228.1 172.16.228.2 any -P out ipsec esp/transport//require ah/transport//require;\n\n# inbound traffic from 172.16.228.2 to 172.16.228.1\nspdadd 172.16.228.2 172.16.228.1 any -P in ipsec esp/transport//require ah/transport//require;\n```\n\nGetting the networking to this point took quite some work. For starters, we did\nnot have a staging environment similar enough to our production environment, so\nI had to create one for this occasion.\n\nOn top of that, to model our production setup, I had to use an AWS 'Virtual\nPrivate Cloud', which was new technology for us. It took a while before I\nfound some [vital information about using multiple IP\naddresses](http://engineering.silk.co/post/31923247961/multiple-ip-addresses-on-amazon-ec2)\nthat was not obvious from the AWS documentation: if you want to have two public\nIP addresses on an AWS VPC node, you need to put two corresponding private IP\naddresses on one 'Elastic Network Interface', instead of creating two network\ninterfaces with one private IP each.\n\n## Configuring three-way DRBD replication\n\nWith the basic networking figured out the next thing I had to do was to adapt\nour production failover script so that we maintain redundancy while migrating\nthe data. 
'Failover' is a procedure where you move a service (gitlab.com) over\nto a different computer after a failure. Our failover procedure is managed by a\nscript. My goal was to make sure that if one of our production servers failed,\nany teammate of mine on pager duty would be able to restore the gitlab.com\nservice using our normal failover procedure. That meant I had to update the\nscript to use the new three-way DRBD configuration.\n\nI certainly got a little more familiar with tcpdump (`tcpdump -n -i\nINTERFACE`), having multiple layers of firewalls\n([UFW](http://en.wikipedia.org/wiki/Uncomplicated_Firewall) and AWS [Security\nGroups](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-network-security.html)),\nand how to get any useful log messages from DRBD:\n\n```shell\n# Monitor DRBD log messages\nsudo tail -f /var/log/messages | grep -e drbd -e d-con\n```\n\nI later learned that I actually deployed a new version of the failover script\nwith a bug in it that potentially could have confused the hell out of my\nteammates had they had to use it under duress. Luckily we never actually needed\nthe failover procedure during the time the new script was in production.\n\nBut, even though I was introducing new complexity and hence bugs into our\nfailover tooling, I did manage to learn and try out enough things to bring this\nproject to a successful conclusion.\n\n## Enabling the DRBD replication\n\nThis part was relatively easy. I just had to grow the DRBD block device\n'drbd0' so that it could accommodate the new stacked (inner) block device\n'drbd10' without having to shrink our production filesystem. 
Because drbd0 was\nbacked by LVM and we had some space left this was a matter of invoking\n`lvextend` and `drbdadm resize` on both our production nodes.\n\nThe step after this was the first one where I had to take gitlab.com offline.\nIn order to 'activate' drbd10 and start the synchronization, I had to unmount\n`/dev/drbd0` from `/var/opt/gitlab` and mount `/dev/drbd10` in its place. This\ntook less than 5 minutes. After this the actual migration was under way!\n\n## Too slow\n\nAt this point I was briefly excited to be able to share some good news with the\nrest of the team. While staring at the DRBD progress bar for the\nsynchronization I started to realize however that the progress bar was telling\nme that the synchronization would take about 50-60 days at 2MB/s.\n\nThis prognosis was an improvement over what we would expect based on our\nprevious experience moving 1.8TB from North Virginia (US) to Delft (NL) in\nabout two weeks (across the Atlantic Ocean!). If you extrapolate that\nrate, you would expect moving 9TB to take 70 days. We were disappointed\nnonetheless because we were hoping that we would gain more throughput by moving\nover a shorter distance this time around (Delft and Frankfurt are about 400km\napart).\n\nThe first thing I started looking into at this point was whether we could\nsomehow make better use of the network bandwidth at our disposal. Sending fake\ndata (zeroes) over the (encrypted) IPIP tunnel (`dd if=/dev/zero | nc remote_ip\n1234`) we could get about 17 MB/s. By disabling IPsec (not really an option as\nfar as I am concerned) we could increase that number to 40 MB/s.\n\nThe only conclusion I could come to was that we were not reaching our maximum\nbandwidth potential, but that I had no clue how to coax more speed out of the\nDRBD sync. 
Luckily I recalled reading about another magical DRBD feature.\n\n## Bring out the truck\n\nThe solution suggested by the DRBD documentation for situations like ours is\ncalled ['truck based\nreplication'](https://drbd.linbit.com/users-guide/s-using-truck-based-replication.html).\nInstead of synchronizing 9TB of data, we would be telling DRBD to mark a point\nin time, take a full disk snapshot, move the snapshot to the new location (as a\nbox full of hard drives in a truck if needed), and then tell DRBD to get the\ndata at the new location up to date. During that 'catching-up' sync, DRBD would\nonly be resending those parts of the disk that actually changed since we marked\nthe point in time earlier. Because our users would not have written 9TB of new\ndata while the 'disks' were being shipped, we would have to sync much less than\n9TB.\n\n![Full replication versus 'truck' replication](https://about.gitlab.com/images/drbd/drbd-truck-sync.png)\n\nIn our case I would not have to use an actual truck; while testing the network\nthroughput between our old and new server I found that I could stream zeroes\nthrough SSH at about 35MB/s.\n\n```text\ndd if=/dev/zero bs=1M count=100 | ssh theo.gitlab.com dd of=/dev/null\n```\n\nAfter doing some testing with the leftover two-node staging setup I built\nearlier to figure out the networking I felt I could make this work. 
I followed\nthe steps in the DRBD documentation, made an LVM snapshot on the active origin\nserver, and started sending the snapshot to the new server with the following\nscript.\n\n```bash\n#!/bin/sh\nblock_count=100\nblock_size='8M'\nremote='54.93.71.23'\n\nsend_blocks() {\n  for skip in $(seq $1 ${block_count} $2) ; do\n    echo \"${skip}   $(date)\"\n    sudo dd if=/dev/gitlab_vg/truck bs=${block_size} count=${block_count} skip=${skip} status=noxfer iflag=fullblock \\\n    | ssh -T ${remote} sudo dd of=/dev/gitlab_vg/gitlab_com bs=${block_size} count=${block_count} seek=${skip} status=none iflag=fullblock\n  done\n}\n\ncheck_blocks() {\n  for skip in $(seq $2 ${block_count} $3) ; do\n    printf \"${skip}   \"\n    sudo dd if=$1 bs=${block_size} count=${block_count} skip=${skip} iflag=fullblock | md5sum\n  done\n}\n\ncase $1 in\n  send)\n    send_blocks $2 $3\n    ;;\n  check)\n    check_blocks $2 $3 $4\n    ;;\n  *)\n    echo \"Usage: $0 (send START END) | (check BLOCK_DEVICE START END)\"\n    exit 127\nesac\n```\n\nBy running this script in a [screen](http://www.gnu.org/software/screen/)\nsession I was able to copy the LVM snapshot `/dev/gitlab_vg/truck` from the old\nserver to the new server in about 3.5 days, 800 MB at a time. The 800MB number\nwas a bit of a coincidence, stemming from the recommendation from our Dutch\nhosters [NetCompany](http://www.netcompany.nl/) to use 8MB `dd`-blocks. Also\ncoincidentally, the total disk size was divisible by 8MB. If you have an eye\nfor system security you might notice that the script needed both root\nprivileges on the source server, and via short-lived unattended SSH sessions\ninto the remote server (`| ssh sudo ...`). 
This is not a normal thing for us to\ndo, and my colleagues got spammed by warning messages about it while this\nmigration was in progress.\n\nBecause I am a little paranoid, I was running a second instance of this script\nin parallel with the sync, where I was calculating MD5 checksums of all the\nblocks that were being sent across the network. By calculating the same\nchecksums on the migration target I could gain sufficient confidence that all\ndata made it across without errors. If there had been any errors, the script would\nhave made it easy to re-send an individual 800MB block.\n\nAt this point my spirits were lifting again and I told my teammates we would\nprobably need one extra day after the 'truck' stage before we could start using\nthe new server. I did not know yet that 'one day' would become 'one week'.\n\n## Shipping too much data\n\nAfter moving the big snapshot across the network with\n[dd](http://en.wikipedia.org/wiki/Dd_%28Unix%29) and SSH, the next step would\nbe to 'just turn DRBD on and let it catch up'. But that did not work all of a\nsudden! It took me a while to realize that the problem was that while trucking,\nI had sent _too much_ data to the new server (theo). If you recall the picture\nI drew earlier of the three-way DRBD replication then you can see that the goal\nwas to replicate the 'green box' from the old servers to the new server, while\nletting the old servers keep sharing the 'blue box' for redundancy.\n\n![Blue box on the left, green box on the\nright](https://about.gitlab.com/images/drbd/drbd-too-much-data.png)\n\nBut I had just sent a snapshot of the _blue_ box to theo (the server on the\nright), not just the green box. DRBD was refusing to turn back on on theo,\nbecause it was expecting the green box, not the blue box (containing the green\nbox). More precisely, my disk on the new server contained metadata for drbd0 as\nwell as drbd10. DRBD finds its metadata by starting at the end of the disk and\nwalking backwards. 
Because of that, it was not seeing the drbd10 (green)\nmetadata on theo.\n\n![Two metadata blocks](https://about.gitlab.com/images/drbd/drbd-two-metadata-blocks.png)\n\nThe first thing I tried was to shrink the disk (with\n[LVM](http://en.wikipedia.org/wiki/Logical_Volume_Manager_%28Linux%29)) so that\nthe blue block at the end would fall off. Unfortunately, you can only grow and\nshrink LVM disks in fixed steps (4MB steps in our case), and those steps did\nnot align with where the drbd10 metadata (green box) ended on disk.\n\nThe next thing I tried was to erase the blue block. That would leave DRBD\nunable to find any metadata, because DRBD metadata must sit at the end of the\ndisk. To cope with that I tried to trick DRBD into thinking it was in the\nmiddle of a disk resize operation. By manually creating a doctored\n`/var/lib/drbd/drbd-minor-10.lkbd` file used by DRBD when it does a\n(legitimate) disk resize, I was pointing it to where I thought it could find\nthe green block of drbd10 metadata. To be honest this required more disk sector\narithmetic than I was comfortable with. Comfortable or not, I never got this\nprocedure to work without a few screens full of scary DRBD error messages so I\ndecided to call our first truck expedition a bust.\n\n## One last try\n\nWe had just spent four days waiting for a 9TB chunk of data to be transported\nto our new server only to find out that it was getting rejected by DRBD. The\nonly option that seemed left to us was to sit back and wait 50-60 days for a\nregular DRBD sync to happen. There was just this one last thing I wanted to try\nbefore giving up. The stumbling block at this point was getting DRBD on theo to\nfind the metadata for the drbd10 disk. From reading the documentation, I knew\nthat DRBD has metadata export and import commands. 
What if we took a new\nLVM snapshot in Delft, took the disk offline and exported its metadata, and then\non the other end did a metadata import with the proper DRBD import command\n(instead of me writing zeroes to the disk and lying to DRBD about being in the\nmiddle of a resize)? This would require us to redo the truck dance and wait\nfour days, but four days was still better than 50 days.\n\nUsing the staging setup I built at the start of this process (a good time\ninvestment!) I created a setup that allowed me to test three-way replication\nand truck-based replication at the same time. Without having to do any\narithmetic I came up with an intimidating but reliable sequence of commands to\n(1) initiate truck based replication and (2) export the DRBD metadata.\n\n```shell\nsudo lvremove -f gitlab_vg/truck\n## clear the bitmap to mark the sync point in time\nsudo drbdadm disconnect --stacked gitlab_data-stacked\nsudo drbdadm new-current-uuid --clear-bitmap --stacked gitlab_data-stacked/0\n## create a metadata dump\necho Yes | sudo gitlab-drbd slave\nsudo drbdadm primary gitlab_data\nsudo drbdadm apply-al --stacked gitlab_data-stacked\nsudo drbdadm dump-md --stacked gitlab_data-stacked > stacked-md-$(date +%s).txt\n## Create a block device snapshot\nsudo lvcreate -n truck -s --extents 50%FREE gitlab_vg/drbd\n## Turn gitlab back on\necho Yes |sudo gitlab-drbd slave\necho Yes |sudo gitlab-drbd master\n## Make sure the current node will 'win' as primary later on\nsudo drbdadm new-current-uuid --stacked gitlab_data-stacked/0\n```\n\nThis time I needed to take gitlab.com offline for a few minutes to be able to\ndo the metadata export. After that, a second waiting period of 4 days of\nstreaming the disk snapshot with `dd` and `ssh` commenced. And then came the\nbig moment of turning DRBD back on on theo. It worked! 
Now I just had to wait\nfor the changes on disk of the last four days to be replicated (which took\nabout a day) and we were ready to flip the big switch, update the DNS and start\nserving gitlab.com from AWS. That final transition took another 10 minutes of\ndowntime, and then we were done.\n\n## Looking back\n\nAs soon as we flipped the switch and started operating out of AWS/Frankfurt,\ngitlab.com became noticeably more responsive. This is in spite of the fact that\nwe are _still_ running on a single server (an [AWS\nc3.8xlarge](http://aws.amazon.com/ec2/instance-types/#c3) instance at the\nmoment).\n\nCounting from the moment I was tasked to work on this data migration, we were\nable to move a 9TB filesystem to a different data center and hosting provider\nin three weeks, requiring 20 minutes of total downtime (spread over three\nmaintenance windows). We took an operational risk of prolonged downtime due to\noperator confusion in case of incidents, by deploying a new configuration that\nwhile tested to some degree was understood by only one member of the operations\nteam (myself). We were lucky that there was no incident during those three\nweeks that made this lack of shared knowledge a problem.\n\nNow if you will excuse me I have to go and explain to my colleagues how our\nnew gitlab.com infrastructure on AWS is set up. :)\n","engineering",{"slug":13,"featured":14,"template":15},"moving-all-your-data",false,"BlogPost",{"title":5,"description":17,"authors":18,"heroImage":19,"date":20,"body":10,"category":11},"At GitLab B.V. we are working on an infrastructure upgrade to give more CPU power and storage space to GitLab.com. Learn more here!",[9],"https://res.cloudinary.com/about-gitlab-com/image/upload/v1749684774/Blog/Hero%20Images/van.jpg","2015-03-09","yml",null,{},true,"/en-us/blog/moving-all-your-data","seo:\n  title: Moving all your data, 9TB edition\n  description: >-\n    At GitLab B.V. 
we are working on an infrastructure upgrade to give more CPU\n    power and storage space to GitLab.com. Learn more here!\n  ogTitle: Moving all your data, 9TB edition\n  ogDescription: >-\n    At GitLab B.V. we are working on an infrastructure upgrade to give more CPU\n    power and storage space to GitLab.com. Learn more here!\n  noIndex: false\n  ogImage: >-\n    https://res.cloudinary.com/about-gitlab-com/image/upload/v1749684774/Blog/Hero%20Images/van.jpg\n  ogUrl: https://about.gitlab.com/blog/moving-all-your-data\n  ogSiteName: https://about.gitlab.com\n  ogType: article\n  canonicalUrls: https://about.gitlab.com/blog/moving-all-your-data\ncontent:\n  title: Moving all your data, 9TB edition\n  description: >-\n    At GitLab B.V. we are working on an infrastructure upgrade to give more CPU\n    power and storage space to GitLab.com. Learn more here!\n  authors:\n    - Jacob Vosmaer\n  heroImage: >-\n    https://res.cloudinary.com/about-gitlab-com/image/upload/v1749684774/Blog/Hero%20Images/van.jpg\n  date: '2015-03-09'\n  body: >\n    At GitLab B.V. we are working on an infrastructure upgrade to give more CPU\n\n    power and storage space to GitLab.com. (We are currently still running on a\n\n    [single server](/blog/the-hardware-that-powers-100k-git-repos/).) As a\n\n    part of this upgrade we wanted to move gitlab.com from our own dedicated\n\n    hardware servers to an AWS data center 400 kilometers away.  In this blog\n    post\n\n    I will tell you how I did that and what challenges I had to overcome. An\n    epic\n\n    adventure of hand-rolled network tunnels, advanced DRBD features and\n    streaming\n\n    9TB of data through SSH pipes!\n\n\n    \u003C!-- more -->\n\n\n    ## What did I have to move?\n\n\n    In our current setup we run a stock GitLab Enterprise Edition omnibus\n    package,\n\n    with a single big filesystem mounted at `/var/opt/gitlab`. 
This\n\n    filesystem holds all the user data hosted on gitlab.com: Postgres and Redis\n\n    database files, user uploads, and a lot of Git repositories. All I had to do\n\n    to move this data to AWS is to move the files on this filesystem. Sounds\n    simple\n\n    enough, does it not?\n\n\n    So do we move the files, or the filesystem itself? This is an easy question\n    to\n\n    answer. Moving the files using something like Rsync is not an option because\n    it\n\n    is just too slow. We do file-based backups every week where we take a block\n\n    device snapshot, mount the snapshot and send it across with Rsync. That\n\n    currently takes over 24 hours, and 24 hours of downtime while we move\n\n    gitlab.com is not a nice idea. Now you might ask: what if you Rsync once to\n\n    prepare, take the server offline, and then do a quick Rsync just to catch\n    up?\n\n    That would still take hours just for Rsync to walk through all the files and\n\n    directories on disk. No good.\n\n\n    We have faced and solved this same problem in the past when the amount of\n    data\n\n    was 5 times smaller. (Rsync was not an option even then.) What I did at that\n\n    time was to use DRBD to move not just the files themselves, but the whole\n\n    filesystem they sit on. This time around DRBD again seemed like the best\n\n    solution for us. It is not the fastest solution to move a lot of data, but\n    what\n\n    is great about it is that you can keep using the filesystem while the data\n    is\n\n    being moved, and changes will get synchronized continuously. No downtime for\n\n    our users! (Except maybe 5 minutes at the start to set up the sync.)\n\n\n    ## What is DRBD?\n\n\n    [DRBD](http://www.drbd.org) is a system that can create a virtual hard drive\n\n    (block device) on a Linux computer that gets mirrored across a network\n\n    connection to a second Linux computer. 
Both computers give a 'real' hard\n    drive\n\n    to DRBD, and DRBD keeps the contents of the real hard drive the same across\n\n    both computers via the network. One of the two computers gets a virtual hard\n\n    drive from DRBD, which shows the contents of the real hard drive underneath.\n    If\n\n    your first computer crashes, you can 'plug in' the virtual hard drive on the\n\n    second computer in a matter of seconds, and all your data will still be\n    there\n\n    because DRBD kept the 'real' hard drives in sync for you. You can even have\n    the\n\n    two computers that are linked by DRBD sit in different buildings, or on\n\n    different continents. Up until our move to AWS, we were using DRBD to\n    protect\n\n    against hardware failure on the server that runs gitlab.com: if such a\n    failure\n\n    would happen, we could just plug in the virtual hard drive with the user\n    data\n\n    into our stand-by server. In our new data center, the hosting provider\n    (Amazon\n\n    Web Services) has their own solution for plugging virtual hard drives in and\n\n    out called Elastic Block Storage, so we are no longer using DRBD as a\n    virtual\n\n    hard drive. From an availability standpoint this is not better or worse, but\n\n    using EBS drives does make it a lot easier for us to make backups because\n    now\n\n    we can just store snapshots (no more Rsync).\n\n\n    ## Using DRBD for a data migration\n\n\n    Although DRBD is not really made for this purpose, I felt confident using\n    DRBD\n\n    for the migration because I had done it before for a migration between data\n\n    centers. At that time we were moving across the Atlantic Ocean; this time we\n\n    would only be moving from the Netherlands to Germany.  However, the last\n    time\n\n    we used DRBD only as a one-off tool. In our pre-migration setup, we were\n\n    already using DRBD to replicate the filesystem between two servers in the\n    same\n\n    rack. 
DRBD only lets you share a virtual hard drive between two computers,\n    so\n\n    how do we now send the data to a _third_ computer in the new data center?\n\n\n    Luckily, DRBD actually has a trick up its sleeve to deal with this, called\n\n    'stacked resources'. This means that our old servers ('linus' and 'monty')\n\n    would share a virtual hard drive called 'drbd0', and that whoever of the two\n\n    has the 'drbd0' virtual hard drive plugged in gets to use 'drbd0' as the\n    'real'\n\n    hard drive underneath a second virtual hard drive, called 'drbd10', which is\n\n    shared with the new server ('theo'). Also see the picture below.\n\n\n    ![Stacked DRBD\n    replication](https://about.gitlab.com/images/drbd/drbd-three-nodes.png)\n\n\n    If linus would malfunction, we could attach drbd0 (the blue virtual hard\n    drive)\n\n    on monty and keep gitlab.com going. The 'green' replication (to get the data\n    to\n\n    theo) would also be able to continue, even after a failover to monty.\n\n\n    ## Networking\n\n\n    I liked the picture above, so 'all' I had to do was set it up. That ended up\n\n    taking a few days, just to set up a test environment, and to figure out how\n    to\n\n    create a network tunnel for the green traffic. The network tunnel needed to\n\n    have a movable endpoint depending on whether linus or monty was primary. We\n\n    also needed the tunnel because DRBD is not compatible with the [Network\n    Address\n\n    Translation](http://en.wikipedia.org/wiki/Network_address_translation) used\n    by\n\n    AWS. DRBD assumes that whenever a node listens on an IP address, it is also\n\n    reachable for its partner node at that IP address. 
On AWS on the other hand,\n    a\n\n    node will have one or more internal IP addresses, which are distinct from\n    its\n\n    _public_ IP address.\n\n\n    We chose to work around this with an [IPIP\n\n    tunnel](http://en.wikipedia.org/wiki/IP_in_IP) and manually keyed IPsec\n\n    encryption. Previous experiments indicated that this gave us the best\n    network\n\n    throughput compared to OpenVPN and GRE tunnels.\n\n\n    To set up the tunnel I used a shell script that was kept in sync on all\n    three\n\n    servers involved in the migration by Chef.\n\n\n    ```bash\n\n    # Network tunnel configuration script used by GitLab B.V. to migrate data\n    from\n\n    # Delft to Frankfurt\n\n\n    #!/bin/sh\n\n    set -u\n\n\n    PATH=/usr/sbin:/sbin:/usr/bin:/bin\n\n\n    frankfurt_public=54.93.71.23\n\n    frankfurt_replication=172.16.228.2\n\n    test_public=54.152.127.180\n\n    test_replication=172.16.228.1\n\n    delft_public=62.204.93.103\n\n    delft_replication=172.16.228.1\n\n\n    create_tunipip() {\n      if ! ip tunnel show | grep -q tunIPIP ; then\n        echo Creating tunnel tunIPIP\n        ip tunnel add tunIPIP mode ipip ttl 64 local \"$1\" remote \"$2\"\n      fi\n    }\n\n\n    add_tunnel_address() {\n      if ! 
ip address show tunIPIP | grep -q \"$1\" ; then\n        ip address add \"$1/32\" peer \"$2/32\" dev tunIPIP\n      fi\n    }\n\n\n    case $(hostname) in\n      ip-10-0-2-9)\n        create_tunipip 10.0.2.140 \"${frankfurt_public}\"\n        add_tunnel_address \"${test_replication}\" \"${frankfurt_replication}\"\n        ip link set tunIPIP up\n        ;;\n      ip-10-0-2-245)\n        create_tunipip 10.0.2.11 \"${frankfurt_public}\"\n        add_tunnel_address \"${test_replication}\" \"${frankfurt_replication}\"\n        ip link set tunIPIP up\n        ;;\n      ip-10-1-0-52|theo.gitlab.com)\n        create_tunipip 10.1.0.52 \"${delft_public}\"\n        add_tunnel_address \"${frankfurt_replication}\" \"${delft_replication}\"\n        ip link set tunIPIP up\n        ;;\n      linus|monty)\n        create_tunipip \"${delft_public}\" \"${frankfurt_public}\"\n        add_tunnel_address \"${delft_replication}\" \"${frankfurt_replication}\"\n        ip link set tunIPIP up\n        ;;\n    esac\n\n    ```\n\n\n    This script was configured to run on boot. Note that it covers our Delft\n    nodes\n\n    (linus and monty, then current production), the node we were migrating to in\n\n    Frankfurt (theo), and two AWS test nodes that were part of a staging setup.\n    We\n\n    chose the AWS Frankfurt (Germany) data center because of its geographic\n\n    proximity to Delft (The Netherlands).\n\n\n    We configured IPsec with `/etc/ipsec-tools.conf`. 
An example for the\n    'origin'\n\n    configuration would be:\n\n\n    ```text\n\n    #!/usr/sbin/setkey -f\n\n\n    # Configuration for 172.16.228.1\n\n\n    # Flush the SAD and SPD\n\n    flush;\n\n    spdflush;\n\n\n    # Attention: Use this keys only for testing purposes!\n\n    # Generate your own keys!\n\n\n    # AH SAs using 128 bit long keys\n\n    # Fill in your keys below!\n\n    add 172.16.228.1 172.16.228.2 ah 0x200 -A hmac-md5 0xfoobar;\n\n    add 172.16.228.2 172.16.228.1 ah 0x300 -A hmac-md5 0xbarbaz;\n\n\n    # ESP SAs using 192 bit long keys (168 + 24 parity)\n\n    # Fill in your keys below!\n\n    add 172.16.228.1 172.16.228.2 esp 0x201 -E 3des-cbc 0xquxfoo;\n\n    add 172.16.228.2 172.16.228.1 esp 0x301 -E 3des-cbc 0xbazqux;\n\n\n    # Security policies\n\n    # outbound traffic from 172.16.228.1 to 172.16.228.2\n\n    spdadd 172.16.228.1 172.16.228.2 any -P out ipsec esp/transport//require\n    ah/transport//require;\n\n\n    # inbound traffic from 172.16.228.2 to 172.16.228.1\n\n    spdadd 172.16.228.2 172.16.228.1 any -P in ipsec esp/transport//require\n    ah/transport//require;\n\n    ```\n\n\n    Getting the networking to this point took quite some work. For starters, we\n    did\n\n    not have a staging environment similar enough to our production environment,\n    so\n\n    I had to create one for this occasion.\n\n\n    On top of that, to model our production setup, I had to use an AWS 'Virtual\n\n    Private Cloud', which was new technology for us. 
It took a while before I\n\n    found some [vital information about using multiple IP\n\n    addresses](http://engineering.silk.co/post/31923247961/multiple-ip-addresses-on-amazon-ec2)\n\n    that was not obvious from the AWS documentation: if you want to have two\n    public\n\n    IP addresses on an AWS VPC node, you need to put two corresponding private\n    IP\n\n    addresses on one 'Elastic Network Interface', instead of creating two\n    network\n\n    interfaces with one private IP each.\n\n\n    ## Configuring three-way DRBD replication\n\n\n    With the basic networking figured out, the next thing I had to do was to\n    adapt\n\n    our production failover script so that we maintain redundancy while\n    migrating\n\n    the data. 'Failover' is a procedure where you move a service (gitlab.com)\n    over\n\n    to a different computer after a failure. Our failover procedure is managed\n    by a\n\n    script. My goal was to make sure that if one of our production servers\n    failed,\n\n    any teammate of mine on pager duty would be able to restore the gitlab.com\n\n    service using our normal failover procedure. That meant I had to update the\n\n    script to use the new three-way DRBD configuration.\n\n\n    I certainly got a little more familiar with tcpdump (`tcpdump -n -i\n\n    INTERFACE`), having multiple layers of firewalls\n\n    ([UFW](http://en.wikipedia.org/wiki/Uncomplicated_Firewall) and AWS\n    [Security\n\n    Groups](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-network-security.html)),\n\n    and how to get any useful log messages from DRBD:\n\n\n    ```shell\n\n    # Monitor DRBD log messages\n\n    sudo tail -f /var/log/messages | grep -e drbd -e d-con\n\n    ```\n\n\n    I later learned that I actually deployed a new version of the failover\n    script\n\n    with a bug in it that potentially could have confused the hell out of my\n\n    teammates had they had to use it under duress. 
Luckily we never actually\n    needed\n\n    the failover procedure during the time the new script was in production.\n\n\n    But, even though I was introducing new complexity and hence bugs into our\n\n    failover tooling, I did manage to learn and try out enough things to bring\n    this\n\n    project to a successful conclusion.\n\n\n    ## Enabling the DRBD replication\n\n\n    This part was relatively easy. I just had to grow the DRBD block device\n\n    'drbd0' so that it could accommodate the new stacked (inner) block device\n\n    'drbd10' without having to shrink our production filesystem. Because drbd0\n    was\n\n    backed by LVM and we had some space left, this was a matter of invoking\n\n    `lvextend` and `drbdadm resize` on both our production nodes.\n\n\n    The step after this was the first one where I had to take gitlab.com\n    offline.\n\n    In order to 'activate' drbd10 and start the synchronization, I had to\n    unmount\n\n    `/dev/drbd0` from `/var/opt/gitlab` and mount `/dev/drbd10` in its place.\n    This\n\n    took less than 5 minutes. After this the actual migration was under way!\n\n\n    ## Too slow\n\n\n    At this point I was briefly excited to be able to share some good news with\n    the\n\n    rest of the team. While staring at the DRBD progress bar for the\n\n    synchronization, however, I started to realize that the progress bar was\n    telling\n\n    me that the synchronization would take about 50-60 days at 2MB/s.\n\n\n    This prognosis was an improvement over what we would expect based on our\n\n    previous experience moving 1.8TB from North Virginia (US) to Delft (NL) in\n\n    about two weeks (across the Atlantic Ocean!). Extrapolating that\n\n    rate, you would expect moving 9TB to take 70 days. 
We were disappointed\n\n    nonetheless because we were hoping that we would gain more throughput by\n    moving\n\n    over a shorter distance this time around (Delft and Frankfurt are about\n    400km\n\n    apart).\n\n\n    The first thing I started looking into at this point was whether we could\n\n    somehow make better use of the network bandwidth at our disposal. Sending\n    fake\n\n    data (zeroes) over the (encrypted) IPIP tunnel (`dd if=/dev/zero | nc\n    remote_ip\n\n    1234`) we could get about 17 MB/s. By disabling IPsec (not really an option\n    as\n\n    far as I am concerned) we could increase that number to 40 MB/s.\n\n\n    The only conclusion I could come to was that we were not reaching our\n    maximum\n\n    bandwidth potential, but that I had no clue how to coax more speed out of\n    the\n\n    DRBD sync. Luckily I recalled reading about another magical DRBD feature.\n\n\n    ## Bring out the truck\n\n\n    The solution suggested by the DRBD documentation for situations like ours is\n\n    called ['truck based\n\n    replication'](https://drbd.linbit.com/users-guide/s-using-truck-based-replication.html).\n\n    Instead of synchronizing 9TB of data, we would be telling DRBD to mark a\n    point\n\n    in time, take a full disk snapshot, move the snapshot to the new location\n    (as a\n\n    box full of hard drives in a truck if needed), and then tell DRBD to get the\n\n    data at the new location up to date. During that 'catching-up' sync, DRBD\n    would\n\n    only be resending those parts of the disk that actually changed since we\n    marked\n\n    the point in time earlier. 
Because our users would not have written 9TB of\n    new\n\n    data while the 'disks' were being shipped, we would have to sync much less\n    than\n\n    9TB.\n\n\n    ![Full replication versus 'truck'\n    replication](https://about.gitlab.com/images/drbd/drbd-truck-sync.png)\n\n\n    In our case I would not have to use an actual truck; while testing the\n    network\n\n    throughput between our old and new server I found that I could stream zeroes\n\n    through SSH at about 35MB/s.\n\n\n    ```text\n\n    dd if=/dev/zero bs=1M count=100 | ssh theo.gitlab.com dd of=/dev/null\n\n    ```\n\n\n    After doing some testing with the leftover two-node staging setup I built\n\n    earlier to figure out the networking I felt I could make this work. I\n    followed\n\n    the steps in the DRBD documentation, made an LVM snapshot on the active\n    origin\n\n    server, and started sending the snapshot to the new server with the\n    following\n\n    script.\n\n\n    ```bash\n\n    #!/bin/sh\n\n    block_count=100\n\n    block_size='8M'\n\n    remote='54.93.71.23'\n\n\n    send_blocks() {\n      for skip in $(seq $1 ${block_count} $2) ; do\n        echo \"${skip}   $(date)\"\n        sudo dd if=/dev/gitlab_vg/truck bs=${block_size} count=${block_count} skip=${skip} status=noxfer iflag=fullblock \\\n        | ssh -T ${remote} sudo dd of=/dev/gitlab_vg/gitlab_com bs=${block_size} count=${block_count} seek=${skip} status=none iflag=fullblock\n      done\n    }\n\n\n    check_blocks() {\n      for skip in $(seq $2 ${block_count} $3) ; do\n        printf \"${skip}   \"\n        sudo dd if=$1 bs=${block_size} count=${block_count} skip=${skip} iflag=fullblock | md5sum\n      done\n    }\n\n\n    case $1 in\n      send)\n        send_blocks $2 $3\n        ;;\n      check)\n        check_blocks $2 $3 $4\n        ;;\n      *)\n        echo \"Usage: $0 (send START END) | (check BLOCK_DEVICE START END)\"\n        exit 127\n    esac\n\n    ```\n\n\n    By running this script in a 
[screen](http://www.gnu.org/software/screen/)\n\n    session I was able to copy the LVM snapshot `/dev/gitlab_vg/truck` from the\n    old\n\n    server to the new server in about 3.5 days, 800 MB at a time. The 800MB\n    number\n\n    was a bit of a coincidence, stemming from the recommendation from our Dutch\n\n    hosters [NetCompany](http://www.netcompany.nl/) to use 8MB `dd`-blocks. Also\n\n    coincidentally, the total disk size was divisible by 8MB. If you have an eye\n\n    for system security you might notice that the script needed root\n\n    privileges both on the source server and, via short-lived unattended SSH sessions,\n\n    on the remote server (`| ssh sudo ...`). This is not a normal thing for us\n    to\n\n    do, and my colleagues got spammed by warning messages about it while this\n\n    migration was in progress.\n\n\n    Because I am a little paranoid, I was running a second instance of this\n    script\n\n    in parallel with the sync, where I was calculating MD5 checksums of all the\n\n    blocks that were being sent across the network. By calculating the same\n\n    checksums on the migration target I could gain sufficient confidence that\n    all\n\n    data made it across without errors. If there had been any errors, the script\n    would\n\n    have made it easy to re-send an individual 800MB block.\n\n\n    At this point my spirits were lifting again and I told my teammates we would\n\n    probably need one extra day after the 'truck' stage before we could start\n    using\n\n    the new server. I did not know yet that 'one day' would become 'one week'.\n\n\n    ## Shipping too much data\n\n\n    After moving the big snapshot across the network with\n\n    [dd](http://en.wikipedia.org/wiki/Dd_%28Unix%29) and SSH, the next step\n    would\n\n    be to 'just turn DRBD on and let it catch up'. But that did not work all of\n    a\n\n    sudden! 
It took me a while to realize that the problem was that while\n    trucking,\n\n    I had sent _too much_ data to the new server (theo). If you recall the\n    picture\n\n    I drew earlier of the three-way DRBD replication then you can see that the\n    goal\n\n    was to replicate the 'green box' from the old servers to the new server,\n    while\n\n    letting the old servers keep sharing the 'blue box' for redundancy.\n\n\n    ![Blue box on the left, green box on the\n\n    right](https://about.gitlab.com/images/drbd/drbd-too-much-data.png)\n\n\n    But I had just sent a snapshot of the _blue_ box to theo (the server on the\n\n    right), not just the green box. DRBD was refusing to turn back on on theo,\n\n    because it was expecting the green box, not the blue box (containing the\n    green\n\n    box). More precisely, my disk on the new server contained metadata for drbd0\n    as\n\n    well as drbd10. DRBD finds its metadata by starting at the end of the disk\n    and\n\n    walking backwards. Because of that, it was not seeing the drbd10 (green)\n\n    metadata on theo.\n\n\n    ![Two metadata\n    blocks](https://about.gitlab.com/images/drbd/drbd-two-metadata-blocks.png)\n\n\n    The first thing I tried was to shrink the disk (with\n\n    [LVM](http://en.wikipedia.org/wiki/Logical_Volume_Manager_%28Linux%29)) so\n    that\n\n    the blue block at the end would fall off. Unfortunately, you can only grow\n    and\n\n    shrink LVM disks in fixed steps (4MB steps in our case), and those steps did\n\n    not align with where the drbd10 metadata (green box) ended on disk.\n\n\n    The next thing I tried was to erase the blue block. That would leave DRBD\n\n    unable to find any metadata, because DRBD metadata must sit at the end of\n    the\n\n    disk. To cope with that I tried to trick DRBD into thinking it was in the\n\n    middle of a disk resize operation. 
By manually creating a doctored\n\n    `/var/lib/drbd/drbd-minor-10.lkbd` file used by DRBD when it does a\n\n    (legitimate) disk resize, I was pointing it to where I thought it could find\n\n    the green block of drbd10 metadata. To be honest this required more disk\n    sector\n\n    arithmetic than I was comfortable with. Comfortable or not, I never got this\n\n    procedure to work without a few screens full of scary DRBD error messages so\n    I\n\n    decided to call our first truck expedition a bust.\n\n\n    ## One last try\n\n\n    We had just spent four days waiting for a 9TB chunk of data to be\n    transported\n\n    to our new server only to find out that it was getting rejected by DRBD. The\n\n    only option that seemed left to us was to sit back and wait 50-60 days for a\n\n    regular DRBD sync to happen. There was just this one last thing I wanted to\n    try\n\n    before giving up. The stumbling block at this point was getting DRBD on theo\n    to\n\n    find the metadata for the drbd10 disk. From reading the documentation, I\n    knew\n\n    that DRBD has metadata export and import commands. What if we took a\n    new\n\n    LVM snapshot in Delft, took the disk offline and exported its metadata, and\n    then\n\n    on the other end did a metadata import with the proper DRBD import command\n\n    (instead of me writing zeroes to the disk and lying to DRBD about being in\n    the\n\n    middle of a resize)? This would require us to redo the truck dance and wait\n\n    four days, but four days was still better than 50 days.\n\n\n    Using the staging setup I built at the start of this process (a good time\n\n    investment!) I created a setup that allowed me to test three-way replication\n\n    and truck-based replication at the same time. 
Without having to do any\n\n    arithmetic I came up with an intimidating but reliable sequence of commands\n    to\n\n    (1) initiate truck based replication and (2) export the DRBD metadata.\n\n\n    ```shell\n\n    sudo lvremove -f gitlab_vg/truck\n\n    ## clear the bitmap to mark the sync point in time\n\n    sudo drbdadm disconnect --stacked gitlab_data-stacked\n\n    sudo drbdadm new-current-uuid --clear-bitmap --stacked gitlab_data-stacked/0\n\n    ## create a metadata dump\n\n    echo Yes | sudo gitlab-drbd slave\n\n    sudo drbdadm primary gitlab_data\n\n    sudo drbdadm apply-al --stacked gitlab_data-stacked\n\n    sudo drbdadm dump-md --stacked gitlab_data-stacked > stacked-md-$(date\n    +%s).txt\n\n    ## Create a block device snapshot\n\n    sudo lvcreate -n truck -s --extents 50%FREE gitlab_vg/drbd\n\n    ## Turn gitlab back on\n\n    echo Yes | sudo gitlab-drbd slave\n\n    echo Yes | sudo gitlab-drbd master\n\n    ## Make sure the current node will 'win' as primary later on\n\n    sudo drbdadm new-current-uuid --stacked gitlab_data-stacked/0\n\n    ```\n\n\n    This time I needed to take gitlab.com offline for a few minutes to be able\n    to\n\n    do the metadata export. After that, a second waiting period of 4 days of\n\n    streaming the disk snapshot with `dd` and `ssh` commenced. And then came the\n\n    big moment of turning DRBD back on on theo. It worked! Now I just had to wait\n\n    for the changes on disk of the last four days to be replicated (which took\n\n    about a day) and we were ready to flip the big switch, update the DNS and\n    start\n\n    serving gitlab.com from AWS. That final transition took another 10 minutes\n    of\n\n    downtime, and then we were done.\n\n\n    ## Looking back\n\n\n    As soon as we flipped the switch and started operating out of AWS/Frankfurt,\n\n    gitlab.com became noticeably more responsive. 
This is in spite of the fact\n    that\n\n    we are _still_ running on a single server (an [AWS\n\n    c3.8xlarge](http://aws.amazon.com/ec2/instance-types/#c3) instance at the\n\n    moment).\n\n\n    Counting from the moment I was tasked to work on this data migration, we\n    were\n\n    able to move a 9TB filesystem to a different data center and hosting\n    provider\n\n    in three weeks, requiring 20 minutes of total downtime (spread over three\n\n    maintenance windows). We took an operational risk of prolonged downtime due\n    to\n\n    operator confusion in case of incidents, by deploying a new configuration\n    that\n\n    while tested to some degree was understood by only one member of the\n    operations\n\n    team (myself). We were lucky that there was no incident during those three\n\n    weeks that made this lack of shared knowledge a problem.\n\n\n    Now if you will excuse me I have to go and explain to my colleagues how our\n\n    new gitlab.com infrastructure on AWS is set up. 
:)\n  category: engineering\nconfig:\n  slug: moving-all-your-data\n  featured: false\n  template: BlogPost\n",{"title":5,"description":17,"ogTitle":5,"ogDescription":17,"noIndex":14,"ogImage":19,"ogUrl":28,"ogSiteName":29,"ogType":30,"canonicalUrls":28},"https://about.gitlab.com/blog/moving-all-your-data","https://about.gitlab.com","article","en-us/blog/moving-all-your-data",[],"GJ_U3ABv25CbpX5lWBZckLS6d73fAC-kiJrOfcwNoDI",{"data":35},{"logo":36,"freeTrial":41,"sales":46,"login":51,"items":56,"search":365,"minimal":396,"duo":415,"switchNav":424,"pricingDeployment":435},{"config":37},{"href":38,"dataGaName":39,"dataGaLocation":40},"/","gitlab logo","header",{"text":42,"config":43},"Get free trial",{"href":44,"dataGaName":45,"dataGaLocation":40},"https://gitlab.com/-/trial_registrations/new?glm_source=about.gitlab.com&glm_content=default-saas-trial/","free trial",{"text":47,"config":48},"Talk to sales",{"href":49,"dataGaName":50,"dataGaLocation":40},"/sales/","sales",{"text":52,"config":53},"Sign in",{"href":54,"dataGaName":55,"dataGaLocation":40},"https://gitlab.com/users/sign_in/","sign in",[57,84,179,184,286,346],{"text":58,"config":59,"cards":61},"Platform",{"dataNavLevelOne":60},"platform",[62,68,76],{"title":58,"description":63,"link":64},"The intelligent orchestration platform for DevSecOps",{"text":65,"config":66},"Explore our Platform",{"href":67,"dataGaName":60,"dataGaLocation":40},"/platform/",{"title":69,"description":70,"link":71},"GitLab Duo Agent Platform","Agentic AI for the entire software lifecycle",{"text":72,"config":73},"Meet GitLab Duo",{"href":74,"dataGaName":75,"dataGaLocation":40},"/gitlab-duo-agent-platform/","gitlab duo agent platform",{"title":77,"description":78,"link":79},"Why GitLab","See the top reasons enterprises choose GitLab",{"text":80,"config":81},"Learn more",{"href":82,"dataGaName":83,"dataGaLocation":40},"/why-gitlab/","why 
gitlab",{"text":85,"left":24,"config":86,"link":88,"lists":92,"footer":161},"Product",{"dataNavLevelOne":87},"solutions",{"text":89,"config":90},"View all Solutions",{"href":91,"dataGaName":87,"dataGaLocation":40},"/solutions/",[93,117,140],{"title":94,"description":95,"link":96,"items":101},"Automation","CI/CD and automation to accelerate deployment",{"config":97},{"icon":98,"href":99,"dataGaName":100,"dataGaLocation":40},"AutomatedCodeAlt","/solutions/delivery-automation/","automated software delivery",[102,106,109,113],{"text":103,"config":104},"CI/CD",{"href":105,"dataGaLocation":40,"dataGaName":103},"/solutions/continuous-integration/",{"text":69,"config":107},{"href":74,"dataGaLocation":40,"dataGaName":108},"gitlab duo agent platform - product menu",{"text":110,"config":111},"Source Code Management",{"href":112,"dataGaLocation":40,"dataGaName":110},"/solutions/source-code-management/",{"text":114,"config":115},"Automated Software Delivery",{"href":99,"dataGaLocation":40,"dataGaName":116},"Automated software delivery",{"title":118,"description":119,"link":120,"items":125},"Security","Deliver code faster without compromising security",{"config":121},{"href":122,"dataGaName":123,"dataGaLocation":40,"icon":124},"/solutions/application-security-testing/","security and compliance","ShieldCheckLight",[126,130,135],{"text":127,"config":128},"Application Security Testing",{"href":122,"dataGaName":129,"dataGaLocation":40},"Application security testing",{"text":131,"config":132},"Software Supply Chain Security",{"href":133,"dataGaLocation":40,"dataGaName":134},"/solutions/supply-chain/","Software supply chain security",{"text":136,"config":137},"Software Compliance",{"href":138,"dataGaName":139,"dataGaLocation":40},"/solutions/software-compliance/","software compliance",{"title":141,"link":142,"items":147},"Measurement",{"config":143},{"icon":144,"href":145,"dataGaName":146,"dataGaLocation":40},"DigitalTransformation","/solutions/visibility-measurement/","visibility and 
measurement",[148,152,156],{"text":149,"config":150},"Visibility & Measurement",{"href":145,"dataGaLocation":40,"dataGaName":151},"Visibility and Measurement",{"text":153,"config":154},"Value Stream Management",{"href":155,"dataGaLocation":40,"dataGaName":153},"/solutions/value-stream-management/",{"text":157,"config":158},"Analytics & Insights",{"href":159,"dataGaLocation":40,"dataGaName":160},"/solutions/analytics-and-insights/","Analytics and insights",{"title":162,"items":163},"GitLab for",[164,169,174],{"text":165,"config":166},"Enterprise",{"href":167,"dataGaLocation":40,"dataGaName":168},"/enterprise/","enterprise",{"text":170,"config":171},"Small Business",{"href":172,"dataGaLocation":40,"dataGaName":173},"/small-business/","small business",{"text":175,"config":176},"Public Sector",{"href":177,"dataGaLocation":40,"dataGaName":178},"/solutions/public-sector/","public sector",{"text":180,"config":181},"Pricing",{"href":182,"dataGaName":183,"dataGaLocation":40,"dataNavLevelOne":183},"/pricing/","pricing",{"text":185,"config":186,"link":188,"lists":192,"feature":277},"Resources",{"dataNavLevelOne":187},"resources",{"text":189,"config":190},"View all resources",{"href":191,"dataGaName":187,"dataGaLocation":40},"/resources/",[193,226,249],{"title":194,"items":195},"Getting started",[196,201,206,211,216,221],{"text":197,"config":198},"Install",{"href":199,"dataGaName":200,"dataGaLocation":40},"/install/","install",{"text":202,"config":203},"Quick start guides",{"href":204,"dataGaName":205,"dataGaLocation":40},"/get-started/","quick setup checklists",{"text":207,"config":208},"Learn",{"href":209,"dataGaLocation":40,"dataGaName":210},"https://university.gitlab.com/","learn",{"text":212,"config":213},"Product documentation",{"href":214,"dataGaName":215,"dataGaLocation":40},"https://docs.gitlab.com/","product documentation",{"text":217,"config":218},"Best practice videos",{"href":219,"dataGaName":220,"dataGaLocation":40},"/getting-started-videos/","best practice 
videos",{"text":222,"config":223},"Integrations",{"href":224,"dataGaName":225,"dataGaLocation":40},"/integrations/","integrations",{"title":227,"items":228},"Discover",[229,234,239,244],{"text":230,"config":231},"Customer success stories",{"href":232,"dataGaName":233,"dataGaLocation":40},"/customers/","customer success stories",{"text":235,"config":236},"Blog",{"href":237,"dataGaName":238,"dataGaLocation":40},"/blog/","blog",{"text":240,"config":241},"The Source",{"href":242,"dataGaName":243,"dataGaLocation":40},"/the-source/","the source",{"text":245,"config":246},"Remote",{"href":247,"dataGaName":248,"dataGaLocation":40},"https://handbook.gitlab.com/handbook/company/culture/all-remote/","remote",{"title":250,"items":251},"Connect",[252,257,262,267,272],{"text":253,"config":254},"GitLab Services",{"href":255,"dataGaName":256,"dataGaLocation":40},"/services/","services",{"text":258,"config":259},"Community",{"href":260,"dataGaName":261,"dataGaLocation":40},"/community/","community",{"text":263,"config":264},"Forum",{"href":265,"dataGaName":266,"dataGaLocation":40},"https://forum.gitlab.com/","forum",{"text":268,"config":269},"Events",{"href":270,"dataGaName":271,"dataGaLocation":40},"/events/","events",{"text":273,"config":274},"Partners",{"href":275,"dataGaName":276,"dataGaLocation":40},"/partners/","partners",{"textColor":278,"title":279,"text":280,"link":281},"#000","What’s new in GitLab","Stay updated with our latest features and improvements.",{"text":282,"config":283},"Read the latest",{"href":284,"dataGaName":285,"dataGaLocation":40},"/releases/whats-new/","whats 
new",{"text":287,"config":288,"lists":290},"Company",{"dataNavLevelOne":289},"company",[291],{"items":292},[293,298,304,306,311,316,321,326,331,336,341],{"text":294,"config":295},"About",{"href":296,"dataGaName":297,"dataGaLocation":40},"/company/","about",{"text":299,"config":300,"footerGa":303},"Jobs",{"href":301,"dataGaName":302,"dataGaLocation":40},"/jobs/","jobs",{"dataGaName":302},{"text":268,"config":305},{"href":270,"dataGaName":271,"dataGaLocation":40},{"text":307,"config":308},"Leadership",{"href":309,"dataGaName":310,"dataGaLocation":40},"/company/team/e-group/","leadership",{"text":312,"config":313},"Team",{"href":314,"dataGaName":315,"dataGaLocation":40},"/company/team/","team",{"text":317,"config":318},"Handbook",{"href":319,"dataGaName":320,"dataGaLocation":40},"https://handbook.gitlab.com/","handbook",{"text":322,"config":323},"Investor relations",{"href":324,"dataGaName":325,"dataGaLocation":40},"https://ir.gitlab.com/","investor relations",{"text":327,"config":328},"Trust Center",{"href":329,"dataGaName":330,"dataGaLocation":40},"/security/","trust center",{"text":332,"config":333},"AI Transparency Center",{"href":334,"dataGaName":335,"dataGaLocation":40},"/ai-transparency-center/","ai transparency center",{"text":337,"config":338},"Newsletter",{"href":339,"dataGaName":340,"dataGaLocation":40},"/company/contact/#contact-forms","newsletter",{"text":342,"config":343},"Press",{"href":344,"dataGaName":345,"dataGaLocation":40},"/press/","press",{"text":347,"config":348,"lists":349},"Contact us",{"dataNavLevelOne":289},[350],{"items":351},[352,355,360],{"text":47,"config":353},{"href":49,"dataGaName":354,"dataGaLocation":40},"talk to sales",{"text":356,"config":357},"Support portal",{"href":358,"dataGaName":359,"dataGaLocation":40},"https://support.gitlab.com","support portal",{"text":361,"config":362},"Customer portal",{"href":363,"dataGaName":364,"dataGaLocation":40},"https://customers.gitlab.com/customers/sign_in/","customer 
portal",{"close":366,"login":367,"suggestions":374},"Close",{"text":368,"link":369},"To search repositories and projects, login to",{"text":370,"config":371},"gitlab.com",{"href":54,"dataGaName":372,"dataGaLocation":373},"search login","search",{"text":375,"default":376},"Suggestions",[377,379,383,385,389,393],{"text":69,"config":378},{"href":74,"dataGaName":69,"dataGaLocation":373},{"text":380,"config":381},"Code Suggestions (AI)",{"href":382,"dataGaName":380,"dataGaLocation":373},"/solutions/code-suggestions/",{"text":103,"config":384},{"href":105,"dataGaName":103,"dataGaLocation":373},{"text":386,"config":387},"GitLab on AWS",{"href":388,"dataGaName":386,"dataGaLocation":373},"/partners/technology-partners/aws/",{"text":390,"config":391},"GitLab on Google Cloud",{"href":392,"dataGaName":390,"dataGaLocation":373},"/partners/technology-partners/google-cloud-platform/",{"text":394,"config":395},"Why GitLab?",{"href":82,"dataGaName":394,"dataGaLocation":373},{"freeTrial":397,"mobileIcon":402,"desktopIcon":407,"secondaryButton":410},{"text":398,"config":399},"Start free trial",{"href":400,"dataGaName":45,"dataGaLocation":401},"https://gitlab.com/-/trials/new/","nav",{"altText":403,"config":404},"Gitlab Icon",{"src":405,"dataGaName":406,"dataGaLocation":401},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1758203874/jypbw1jx72aexsoohd7x.svg","gitlab icon",{"altText":403,"config":408},{"src":409,"dataGaName":406,"dataGaLocation":401},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1758203875/gs4c8p8opsgvflgkswz9.svg",{"text":411,"config":412},"Get Started",{"href":413,"dataGaName":414,"dataGaLocation":401},"https://gitlab.com/-/trial_registrations/new?glm_source=about.gitlab.com/get-started/","get started",{"freeTrial":416,"mobileIcon":420,"desktopIcon":422},{"text":417,"config":418},"Learn more about GitLab Duo",{"href":74,"dataGaName":419,"dataGaLocation":401},"gitlab 
duo",{"altText":403,"config":421},{"src":405,"dataGaName":406,"dataGaLocation":401},{"altText":403,"config":423},{"src":409,"dataGaName":406,"dataGaLocation":401},{"button":425,"mobileIcon":430,"desktopIcon":432},{"text":426,"config":427},"/switch",{"href":428,"dataGaName":429,"dataGaLocation":401},"#contact","switch",{"altText":403,"config":431},{"src":405,"dataGaName":406,"dataGaLocation":401},{"altText":403,"config":433},{"src":434,"dataGaName":406,"dataGaLocation":401},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1773335277/ohhpiuoxoldryzrnhfrh.png",{"freeTrial":436,"mobileIcon":441,"desktopIcon":443},{"text":437,"config":438},"Back to pricing",{"href":182,"dataGaName":439,"dataGaLocation":401,"icon":440},"back to pricing","GoBack",{"altText":403,"config":442},{"src":405,"dataGaName":406,"dataGaLocation":401},{"altText":403,"config":444},{"src":409,"dataGaName":406,"dataGaLocation":401},{"title":446,"button":447,"config":452},"See how agentic AI transforms software delivery",{"text":448,"config":449},"Watch GitLab Transcend now",{"href":450,"dataGaName":451,"dataGaLocation":40},"/events/transcend/virtual/","transcend event",{"layout":453,"icon":454,"disabled":24},"release","AiStar",{"data":456},{"text":457,"source":458,"edit":464,"contribute":469,"config":474,"items":479,"minimal":686},"Git is a trademark of Software Freedom Conservancy and our use of 'GitLab' is under license",{"text":459,"config":460},"View page source",{"href":461,"dataGaName":462,"dataGaLocation":463},"https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/","page source","footer",{"text":465,"config":466},"Edit this page",{"href":467,"dataGaName":468,"dataGaLocation":463},"https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/blob/main/content/","web ide",{"text":470,"config":471},"Please 
contribute",{"href":472,"dataGaName":473,"dataGaLocation":463},"https://gitlab.com/gitlab-com/marketing/digital-experience/about-gitlab-com/-/blob/main/CONTRIBUTING.md/","please contribute",{"twitter":475,"facebook":476,"youtube":477,"linkedin":478},"https://twitter.com/gitlab","https://www.facebook.com/gitlab","https://www.youtube.com/channel/UCnMGQ8QHMAnVIsI3xJrihhg","https://www.linkedin.com/company/gitlab-com",[480,527,581,625,652],{"title":180,"links":481,"subMenu":496},[482,486,491],{"text":483,"config":484},"View plans",{"href":182,"dataGaName":485,"dataGaLocation":463},"view plans",{"text":487,"config":488},"Why Premium?",{"href":489,"dataGaName":490,"dataGaLocation":463},"/pricing/premium/","why premium",{"text":492,"config":493},"Why Ultimate?",{"href":494,"dataGaName":495,"dataGaLocation":463},"/pricing/ultimate/","why ultimate",[497],{"title":498,"links":499},"Contact Us",[500,503,505,507,512,517,522],{"text":501,"config":502},"Contact sales",{"href":49,"dataGaName":50,"dataGaLocation":463},{"text":356,"config":504},{"href":358,"dataGaName":359,"dataGaLocation":463},{"text":361,"config":506},{"href":363,"dataGaName":364,"dataGaLocation":463},{"text":508,"config":509},"Status",{"href":510,"dataGaName":511,"dataGaLocation":463},"https://status.gitlab.com/","status",{"text":513,"config":514},"Terms of use",{"href":515,"dataGaName":516,"dataGaLocation":463},"/terms/","terms of use",{"text":518,"config":519},"Privacy statement",{"href":520,"dataGaName":521,"dataGaLocation":463},"/privacy/","privacy statement",{"text":523,"config":524},"Cookie preferences",{"dataGaName":525,"dataGaLocation":463,"id":526,"isOneTrustButton":24},"cookie preferences","ot-sdk-btn",{"title":85,"links":528,"subMenu":537},[529,533],{"text":530,"config":531},"DevSecOps platform",{"href":67,"dataGaName":532,"dataGaLocation":463},"devsecops platform",{"text":534,"config":535},"AI-Assisted Development",{"href":74,"dataGaName":536,"dataGaLocation":463},"ai-assisted 
development",[538],{"title":539,"links":540},"Topics",[541,546,551,556,561,566,571,576],{"text":542,"config":543},"CICD",{"href":544,"dataGaName":545,"dataGaLocation":463},"/topics/ci-cd/","cicd",{"text":547,"config":548},"GitOps",{"href":549,"dataGaName":550,"dataGaLocation":463},"/topics/gitops/","gitops",{"text":552,"config":553},"DevOps",{"href":554,"dataGaName":555,"dataGaLocation":463},"/topics/devops/","devops",{"text":557,"config":558},"Version Control",{"href":559,"dataGaName":560,"dataGaLocation":463},"/topics/version-control/","version control",{"text":562,"config":563},"DevSecOps",{"href":564,"dataGaName":565,"dataGaLocation":463},"/topics/devsecops/","devsecops",{"text":567,"config":568},"Cloud Native",{"href":569,"dataGaName":570,"dataGaLocation":463},"/topics/cloud-native/","cloud native",{"text":572,"config":573},"AI for Coding",{"href":574,"dataGaName":575,"dataGaLocation":463},"/topics/devops/ai-for-coding/","ai for coding",{"text":577,"config":578},"Agentic AI",{"href":579,"dataGaName":580,"dataGaLocation":463},"/topics/agentic-ai/","agentic ai",{"title":582,"links":583},"Solutions",[584,586,588,593,597,600,604,607,609,612,615,620],{"text":127,"config":585},{"href":122,"dataGaName":127,"dataGaLocation":463},{"text":116,"config":587},{"href":99,"dataGaName":100,"dataGaLocation":463},{"text":589,"config":590},"Agile development",{"href":591,"dataGaName":592,"dataGaLocation":463},"/solutions/agile-delivery/","agile delivery",{"text":594,"config":595},"SCM",{"href":112,"dataGaName":596,"dataGaLocation":463},"source code management",{"text":542,"config":598},{"href":105,"dataGaName":599,"dataGaLocation":463},"continuous integration & delivery",{"text":601,"config":602},"Value stream management",{"href":155,"dataGaName":603,"dataGaLocation":463},"value stream 
management",{"text":547,"config":605},{"href":606,"dataGaName":550,"dataGaLocation":463},"/solutions/gitops/",{"text":165,"config":608},{"href":167,"dataGaName":168,"dataGaLocation":463},{"text":610,"config":611},"Small business",{"href":172,"dataGaName":173,"dataGaLocation":463},{"text":613,"config":614},"Public sector",{"href":177,"dataGaName":178,"dataGaLocation":463},{"text":616,"config":617},"Education",{"href":618,"dataGaName":619,"dataGaLocation":463},"/solutions/education/","education",{"text":621,"config":622},"Financial services",{"href":623,"dataGaName":624,"dataGaLocation":463},"/solutions/finance/","financial services",{"title":185,"links":626},[627,629,631,633,636,638,640,642,644,646,648,650],{"text":197,"config":628},{"href":199,"dataGaName":200,"dataGaLocation":463},{"text":202,"config":630},{"href":204,"dataGaName":205,"dataGaLocation":463},{"text":207,"config":632},{"href":209,"dataGaName":210,"dataGaLocation":463},{"text":212,"config":634},{"href":214,"dataGaName":635,"dataGaLocation":463},"docs",{"text":235,"config":637},{"href":237,"dataGaName":238,"dataGaLocation":463},{"text":230,"config":639},{"href":232,"dataGaName":233,"dataGaLocation":463},{"text":245,"config":641},{"href":247,"dataGaName":248,"dataGaLocation":463},{"text":253,"config":643},{"href":255,"dataGaName":256,"dataGaLocation":463},{"text":258,"config":645},{"href":260,"dataGaName":261,"dataGaLocation":463},{"text":263,"config":647},{"href":265,"dataGaName":266,"dataGaLocation":463},{"text":268,"config":649},{"href":270,"dataGaName":271,"dataGaLocation":463},{"text":273,"config":651},{"href":275,"dataGaName":276,"dataGaLocation":463},{"title":287,"links":653},[654,656,658,660,662,664,666,670,675,677,679,681],{"text":294,"config":655},{"href":296,"dataGaName":289,"dataGaLocation":463},{"text":299,"config":657},{"href":301,"dataGaName":302,"dataGaLocation":463},{"text":307,"config":659},{"href":309,"dataGaName":310,"dataGaLocation":463},{"text":312,"config":661},{"href":314,"dataGaN
ame":315,"dataGaLocation":463},{"text":317,"config":663},{"href":319,"dataGaName":320,"dataGaLocation":463},{"text":322,"config":665},{"href":324,"dataGaName":325,"dataGaLocation":463},{"text":667,"config":668},"Sustainability",{"href":669,"dataGaName":667,"dataGaLocation":463},"/sustainability/",{"text":671,"config":672},"Diversity, inclusion and belonging (DIB)",{"href":673,"dataGaName":674,"dataGaLocation":463},"/diversity-inclusion-belonging/","Diversity, inclusion and belonging",{"text":327,"config":676},{"href":329,"dataGaName":330,"dataGaLocation":463},{"text":337,"config":678},{"href":339,"dataGaName":340,"dataGaLocation":463},{"text":342,"config":680},{"href":344,"dataGaName":345,"dataGaLocation":463},{"text":682,"config":683},"Modern Slavery Transparency Statement",{"href":684,"dataGaName":685,"dataGaLocation":463},"https://handbook.gitlab.com/handbook/legal/modern-slavery-act-transparency-statement/","modern slavery transparency statement",{"items":687},[688,691,694],{"text":689,"config":690},"Terms",{"href":515,"dataGaName":516,"dataGaLocation":463},{"text":692,"config":693},"Cookies",{"dataGaName":525,"dataGaLocation":463,"id":526,"isOneTrustButton":24},{"text":695,"config":696},"Privacy",{"href":520,"dataGaName":521,"dataGaLocation":463},[698],{"id":699,"title":9,"body":22,"config":700,"content":702,"description":22,"extension":21,"meta":706,"navigation":24,"path":707,"seo":708,"stem":709,"__hash__":710},"blogAuthors/en-us/blog/authors/jacob-vosmaer.yml",{"template":701},"BlogAuthor",{"name":9,"config":703},{"headshot":704,"ctfId":705},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1749659488/Blog/Author%20Headshots/gitlab-logo-extra-whitespace.png","Jacob-Vosmaer",{},"/en-us/blog/authors/jacob-vosmaer",{},"en-us/blog/authors/jacob-vosmaer","8CR6ShwBsKxqGM4AAocQnzaLp61rqybzr4XSir9Y3Ag",[712,726,740],{"content":713,"config":724},{"title":714,"description":715,"authors":716,"heroImage":718,"date":719,"body":720,"category":11,"tags":721},"How 
to build CI/CD observability at scale","This practical guide to GitLab pipeline analytics helps self-managed users gain operational insights using Prometheus and Grafana.",[717],"Paul Meresanu","https://res.cloudinary.com/about-gitlab-com/image/upload/v1774465167/n5hlvrsrheadeccyr1oz.png","2026-04-28","CI/CD optimization starts with visibility. Building a successful DevOps platform at enterprise scale **should include** understanding pipeline performance, job execution patterns, and quantifiable operational insights — especially for organizations running GitLab self-managed instances.\n\nTo help GitLab customers maximize their platform investments, we developed the GitLab CI/CD Observability solution as part of our Platform Excellence program, which transforms raw pipeline metrics into actionable operational insights.\n\nA leading financial services organization partnered with GitLab's customer success architect to gain visibility into their GitLab self-managed deployment. Together, we implemented a containerized observability solution combining the open-source gitlab-ci-pipelines-exporter with enterprise-grade Prometheus and Grafana infrastructure.\n\nIn this article, you'll learn the challenges they faced managing pipelines at scale and how GitLab CI/CD Observability addressed them with a practical, end-to-end implementation.\n\n## The challenge: Measuring CI/CD performance\nBefore implementing any observability solution, define your measurement landscape:\n*   **What metrics matter?** Pipeline duration, job success rates, queue times, runner utilization\n*   **Who needs visibility?** Developers, DevOps engineers, platform teams, leadership\n*   **What decisions will this drive?** Infrastructure investment, bottleneck remediation, capacity planning\n\n## Solution architecture: A full set of dashboards for observability\nOnce deployed, the observability stack provides a set of Grafana dashboards that give real-time and historical visibility into your CI/CD 
platform. A typical deployment includes:\n*   **Pipeline Overview Dashboard:** A top-level view showing total pipeline runs, success/failure rates over time (as stacked bar or time-series charts), and average pipeline duration trends. Panels use color-coded status indicators (green for success, red for failure, amber for cancelled) so platform teams can spot degradation at a glance.\n*   **Job Performance Dashboard:** Drill-down panels showing individual job duration distributions (histogram), the top 10 slowest jobs by average duration, and job failure heatmaps by project and stage. This is where teams identify specific bottleneck jobs worth optimizing.\n*   **Runner & Infrastructure Dashboard:** Combines Node Exporter host metrics (CPU, memory, disk) with pipeline queue-time data to correlate infrastructure saturation with pipeline wait times. Useful for capacity planning decisions such as scaling runner pools or upgrading instance sizes.\n*   **Deployment Frequency Dashboard:** Tracks deployment count and deployment duration over time per environment, aligned with DORA metrics. Helps engineering leadership assess delivery throughput and environment drift (commits behind main).\n\nEach dashboard is provisioned automatically via Grafana's file-based provisioning, so it deploys consistently across environments. 
The dashboards can be further customized with Grafana variables to filter by project, ref/branch, or time range.\n\n![Solution architecture](https://res.cloudinary.com/about-gitlab-com/image/upload/v1777382608/Blog/Imported/blog-building-ci-cd-observability-stack-for-gitlab-self-managed/image1.png)\n\nThe solution requires two exporters:\n*   **Pipeline Exporter:** Collects CI/CD metrics via GitLab API (pipeline duration, job status, deployments)\n*   **Node Exporter:** Collects host-level metrics (CPU, memory, disk) for infrastructure correlation\n\n**Prerequisites:**\n*   GitLab Self-Managed Version 18.1+\n*   **Container orchestration platform:** A Kubernetes cluster (recommended for enterprise deployments) or a container runtime such as Docker/Podman for smaller scale or proof-of-concept environments. The primary deployment guide below targets Kubernetes; a Docker Compose alternative is provided in the appendix for local testing and evaluation\n*   GitLab Personal Access Token (**read_api** scope)\n\n## Kubernetes deployment (recommended)\nFor enterprise environments, deploy each component as a separate Deployment within a dedicated namespace. This approach integrates with existing cluster infrastructure, secrets management, and network policies.\n\n### 1. Create namespace and secret\n```bash\nkubectl create namespace gitlab-observability\n\n# Create the GitLab token secret (see Secrets Management section below\n# for enterprise-grade approaches using external secret operators)\nkubectl create secret generic gitlab-token \\\n  --from-literal=token=glpat-xxxxxxxxxxxx \\\n  -n gitlab-observability\n```\n\n\n### 2. 
Deploy the Pipeline Exporter\n```yaml\n# exporter-deployment.yaml\napiVersion: apps/v1\nkind: Deployment\nmetadata:\n  name: gitlab-ci-pipelines-exporter\n  namespace: gitlab-observability\nspec:\n  replicas: 1\n  selector:\n    matchLabels:\n      app: gitlab-ci-pipelines-exporter\n  template:\n    metadata:\n      labels:\n        app: gitlab-ci-pipelines-exporter\n    spec:\n      containers:\n        - name: exporter\n          image: mvisonneau/gitlab-ci-pipelines-exporter:latest\n          ports:\n            - containerPort: 8080\n          env:\n            - name: GCPE_GITLAB_TOKEN\n              valueFrom:\n                secretKeyRef:\n                  name: gitlab-token\n                  key: token\n            - name: GCPE_CONFIG\n              value: /etc/gcpe/config.yml\n          volumeMounts:\n            - name: config\n              mountPath: /etc/gcpe\n      volumes:\n        - name: config\n          configMap:\n            name: gcpe-config\n---\napiVersion: v1\nkind: Service\nmetadata:\n  name: gitlab-ci-pipelines-exporter\n  namespace: gitlab-observability\nspec:\n  selector:\n    app: gitlab-ci-pipelines-exporter\n  ports:\n    - port: 8080\n      targetPort: 8080\n```\n\n### 3. Deploy Node Exporter (DaemonSet)\n```yaml\n# node-exporter-daemonset.yaml\napiVersion: apps/v1\nkind: DaemonSet\nmetadata:\n  name: node-exporter\n  namespace: gitlab-observability\nspec:\n  selector:\n    matchLabels:\n      app: node-exporter\n  template:\n    metadata:\n      labels:\n        app: node-exporter\n    spec:\n      containers:\n        - name: node-exporter\n          image: prom/node-exporter:latest\n          ports:\n            - containerPort: 9100\n---\napiVersion: v1\nkind: Service\nmetadata:\n  name: node-exporter\n  namespace: gitlab-observability\nspec:\n  selector:\n    app: node-exporter\n  ports:\n    - port: 9100\n      targetPort: 9100\n```\n\n### 4. 
Deploy Prometheus\n```yaml\n# prometheus-deployment.yaml\napiVersion: apps/v1\nkind: Deployment\nmetadata:\n  name: prometheus\n  namespace: gitlab-observability\nspec:\n  replicas: 1\n  selector:\n    matchLabels:\n      app: prometheus\n  template:\n    metadata:\n      labels:\n        app: prometheus\n    spec:\n      containers:\n        - name: prometheus\n          image: prom/prometheus:latest\n          ports:\n            - containerPort: 9090\n          volumeMounts:\n            - name: config\n              mountPath: /etc/prometheus\n      volumes:\n        - name: config\n          configMap:\n            name: prometheus-config\n---\napiVersion: v1\nkind: Service\nmetadata:\n  name: prometheus\n  namespace: gitlab-observability\nspec:\n  selector:\n    app: prometheus\n  ports:\n    - port: 9090\n      targetPort: 9090\n```\n\n### 5. Deploy Grafana\nThe Grafana deployment below starts with authentication disabled (`GF_AUTH_ANONYMOUS_ENABLED: true`) for initial setup convenience.\n\n**This setting allows anyone with network access to view all dashboards without logging in.** For production deployments, remove this variable or set it to false and configure a proper authentication provider (LDAP, SAML/SSO, or OAuth) to restrict access to authorized users.\n```yaml\n# grafana-deployment.yaml\napiVersion: apps/v1\nkind: Deployment\nmetadata:\n  name: grafana\n  namespace: gitlab-observability\nspec:\n  replicas: 1\n  selector:\n    matchLabels:\n      app: grafana\n  template:\n    metadata:\n      labels:\n        app: grafana\n    spec:\n      containers:\n        - name: grafana\n          image: grafana/grafana:10.0.0\n          ports:\n            - containerPort: 3000\n          env:\n            # REMOVE or set to 'false' for production.\n            # When 'true', any user with network access can\n            # view dashboards without authentication.\n            - name: GF_AUTH_ANONYMOUS_ENABLED\n              value: 'true'\n          
volumeMounts:\n            - name: dashboards-provider\n              mountPath: /etc/grafana/provisioning/dashboards\n            - name: datasources\n              mountPath: /etc/grafana/provisioning/datasources\n            - name: dashboards\n              mountPath: /var/lib/grafana/dashboards\n      volumes:\n        - name: dashboards-provider\n          configMap:\n            name: grafana-dashboards-provider\n        - name: datasources\n          configMap:\n            name: grafana-datasources\n        - name: dashboards\n          configMap:\n            name: grafana-dashboards\n---\napiVersion: v1\nkind: Service\nmetadata:\n  name: grafana\n  namespace: gitlab-observability\nspec:\n  selector:\n    app: grafana\n  ports:\n    - port: 3000\n      targetPort: 3000\n```\n\n### 6. Set network policy\nRestrict inter-pod traffic to only the required communication paths:\n```yaml\n# network-policy.yaml\napiVersion: networking.k8s.io/v1\nkind: NetworkPolicy\nmetadata:\n  name: observability-policy\n  namespace: gitlab-observability\nspec:\n  podSelector: {}\n  policyTypes:\n    - Ingress\n  ingress:\n    # Prometheus scrapes exporter and node-exporter\n    - from:\n        - podSelector:\n            matchLabels:\n              app: prometheus\n      ports:\n        - port: 8080\n        - port: 9100\n    # Grafana queries Prometheus\n    - from:\n        - podSelector:\n            matchLabels:\n              app: grafana\n      ports:\n        - port: 9090\n```\n\n### 7. 
Validate\n```bash\nkubectl get pods -n gitlab-observability\nkubectl port-forward svc/grafana 3000:3000 -n gitlab-observability\ncurl http://localhost:3000/api/health\n```\n\n## Configuration reference\n### Exporter configuration\n```yaml\n# gitlab-ci-pipelines-exporter.yml (ConfigMap: gcpe-config)\nlog:\n  level: info\ngitlab:\n  url: https://gitlab.your-domain.com\n  maximum_requests_per_second: 10\nproject_defaults:\n  pull:\n    pipeline:\n      jobs:\n        enabled: true\nwildcards:\n  - owner:\n      name: your-group-name\n      kind: group\n    archived: false\n```\n\n### Prometheus configuration\n```yaml\n# prometheus.yml (ConfigMap: prometheus-config)\nglobal:\n  scrape_interval: 15s\nscrape_configs:\n  - job_name: 'gitlab-ci-pipelines-exporter'\n    static_configs:\n      - targets: ['gitlab-ci-pipelines-exporter:8080']\n  - job_name: 'node-exporter'\n    static_configs:\n      - targets: ['node-exporter:9100']\n```\n\n### Grafana data sources\n```yaml\n# datasources.yml (ConfigMap: grafana-datasources)\napiVersion: 1\ndatasources:\n  - name: Prometheus\n    type: prometheus\n    access: proxy\n    url: http://prometheus:9090\n    isDefault: true\n# dashboards.yml (ConfigMap: grafana-dashboards-provider)\napiVersion: 1\nproviders:\n  - name: 'default'\n    folder: 'GitLab CI/CD'\n    type: file\n    options:\n      path: /var/lib/grafana/dashboards\n```\n\n## Key metrics\n### Pipeline Exporter metrics\n| Metric | Description |\n| :---- | :---- |\n| `gitlab_ci_pipeline_duration_seconds` | Pipeline execution time |\n| `gitlab_ci_pipeline_status` | Pipeline success/failure by project |\n| `gitlab_ci_pipeline_job_duration_seconds` | Individual job execution time |\n| `gitlab_ci_pipeline_job_status` | Job success/failure status |\n| `gitlab_ci_pipeline_job_artifact_size_bytes` | Artifact storage consumption |\n| `gitlab_ci_pipeline_coverage` | Code coverage percentage |\n| `gitlab_ci_environment_deployment_count` | Deployment frequency |\n| 
`gitlab_ci_environment_deployment_duration_seconds` | Deployment execution time |\n| `gitlab_ci_environment_behind_commits_count` | Environment drift from main |\n\n### Node Exporter metrics\n| Metric | Description |\n| :---- | :---- |\n| `node_cpu_seconds_total` | CPU utilization |\n| `node_memory_MemAvailable_bytes` | Available memory |\n| `node_filesystem_avail_bytes` | Disk space available |\n| `node_load1` | 1-minute load average |\n\n## Troubleshooting\n### Air-gapped Grafana plugin installation\nFor offline environments, install plugins manually. Example for Kubernetes:\n```bash\n# Copy plugin zip into the Grafana pod\nkubectl cp grafana-polystat-panel-2.1.16.zip \\\n  gitlab-observability/grafana-\u003Cpod-id>:/tmp/\n# Extract plugin\nkubectl exec -it -n gitlab-observability deploy/grafana -- \\\n  sh -c \"unzip /tmp/grafana-polystat-panel-2.1.16.zip -d /var/lib/grafana/plugins/\"\n# Restart Grafana pod\nkubectl rollout restart deployment/grafana -n gitlab-observability\n# Verify installation\nkubectl exec -it -n gitlab-observability deploy/grafana -- \\\n  ls -al /var/lib/grafana/plugins/\n```\n\n## Enterprise considerations\nFor regulated industries, ensure:\n*   **Token security:** Store GitLab Personal Access Tokens in a dedicated secrets manager rather than hardcoded in ConfigMaps. Enforce token rotation policies and limit scope to **read\\_api** only.\n*   **Network segmentation:** Deploy behind a reverse proxy with TLS termination. In Kubernetes, use an Ingress controller with automated certificate provisioning.\n*   **Authentication:** Configure Grafana with your organization's identity provider (SAML, LDAP, or OAuth/OIDC) to enforce role-based access control on dashboards.\n\n## Why GitLab?\nGitLab's API-first design enables custom observability solutions that complement native capabilities like Value Stream Analytics and DORA metrics. 
The open architecture allows organizations to integrate proven open-source tooling — like the gitlab-ci-pipelines-exporter — directly with their existing enterprise infrastructure, without disrupting established workflows.\n\nAs your observability maturity grows, GitLab's built-in Observability capabilities provide a natural next step — offering deeper, integrated visibility without additional tooling. Learn more about what's available natively in the platform for [GitLab Observability](https://docs.gitlab.com/operations/observability/observability/).\n",[103,722,723],"product","tutorial",{"featured":14,"template":15,"slug":725},"how-to-build-ci-cd-observability-at-scale",{"content":727,"config":738},{"body":728,"title":729,"description":730,"authors":731,"heroImage":733,"date":734,"category":11,"tags":735},"Most CI/CD tools can run a build and ship a deployment. Where they diverge is what happens when your delivery needs get real: a monorepo with a dozen services, microservices spread across multiple repositories, deployments to dozens of environments, or a platform team trying to enforce standards without becoming a bottleneck.\n  \nGitLab's pipeline execution model was designed for that complexity. Parent-child pipelines, DAG execution, dynamic pipeline generation, multi-project triggers, merge request pipelines with merged results, and CI/CD Components each solve a distinct class of problems. Because they compose, understanding the full model unlocks something more than a faster pipeline. In this article, you'll learn about the five patterns where that model stands out, each mapped to a real engineering scenario with the configuration to match.\n  \nThe configs below are illustrative. The scripts use echo commands to keep the signal-to-noise ratio low. Swap them out for your actual build, test, and deploy steps and they are ready to use.\n\n\n## 1. 
Monorepos: Parent-child pipelines + DAG execution\n\n\nThe problem: Your monorepo has a frontend, a backend, and a docs site. Every commit triggers a full rebuild of everything, even when only a README changed.\n\n\nGitLab solves this with two complementary features: [parent-child pipelines](https://docs.gitlab.com/ci/pipelines/downstream_pipelines/#parent-child-pipelines) (which let a top-level pipeline spawn isolated sub-pipelines) and [DAG execution via `needs`](https://docs.gitlab.com/ci/yaml/#needs) (which breaks rigid stage-by-stage ordering and lets jobs start the moment their dependencies finish).\n\n\nA parent pipeline detects what changed and triggers only the relevant child pipelines:\n\n```yaml\n# .gitlab-ci.yml\nstages:\n  - trigger\n\ntrigger-services:\n  stage: trigger\n  trigger:\n    include:\n      - local: '.gitlab/ci/api-service.yml'\n      - local: '.gitlab/ci/web-service.yml'\n      - local: '.gitlab/ci/worker-service.yml'\n    strategy: depend\n```\n\n\nEach child pipeline is a fully independent pipeline with its own stages, jobs, and artifacts. The parent waits for all of them via [strategy: depend](https://docs.gitlab.com/ci/pipelines/downstream_pipelines/#wait-for-downstream-pipeline-to-complete) so you get a single green/red signal at the top level, with full drill-down into each service's pipeline. This organizational separation is the bigger win for large teams: each service owns its pipeline config, changes in one cannot break another, and the complexity stays manageable as the repo grows.\n\n\nOne thing worth knowing: when you pass [multiple files to a single `trigger: include:`](https://docs.gitlab.com/ci/pipelines/downstream_pipelines/#combine-multiple-child-pipeline-configuration-files), GitLab merges them into a single child pipeline configuration. This means jobs defined across those files share the same pipeline context and can reference each other with `needs:`, which is what makes the DAG optimization possible. 
If you split them into separate trigger jobs instead, each would be its own isolated pipeline and cross-file `needs:` references would not work.\n\n\nCombine this with `needs:` inside each child pipeline and you get DAG execution. Your integration tests can start the moment the build finishes, without waiting for other jobs in the same stage.\n\n```yaml\n# .gitlab/ci/api-service.yml\nstages:\n  - build\n  - test\n\nbuild-api:\n  stage: build\n  script:\n    - echo \"Building API service\"\n\ntest-api:\n  stage: test\n  needs: [build-api]\n  script:\n    - echo \"Running API tests\"\n```\n\n\nWhy it matters: Teams with large monorepos typically report significant reductions in pipeline runtime after switching to DAG execution, since jobs no longer wait on unrelated work in the same stage. Parent-child pipelines add the organizational layer that keeps the configuration maintainable as the repo and team grow.\n\n![Local downstream pipelines](https://res.cloudinary.com/about-gitlab-com/image/upload/v1775738759/Blog/Imported/hackathon-fake-blog-post-s/image3_vwj3rz.png \"Local downstream pipelines\")\n\n## 2. Microservices: Cross-repo, multi-project pipelines\n\n\nThe problem: Your frontend lives in one repo, your backend in another. When the frontend team ships a change, they have no visibility into whether it broke the backend integration and vice versa.\n\n\nGitLab's [multi-project pipelines](https://docs.gitlab.com/ci/pipelines/downstream_pipelines/#multi-project-pipelines) let one project trigger a pipeline in a completely separate project and wait for the result. The triggering project gets a linked downstream pipeline right in its own pipeline view.\n\n\nThe frontend pipeline builds an API contract artifact and publishes it, then triggers the backend pipeline. 
The backend fetches that artifact directly using the [Jobs API](https://docs.gitlab.com/api/jobs/#download-a-single-artifact-file-from-specific-tag-or-branch) and validates it before allowing anything to proceed. If a breaking change is detected, the backend pipeline fails and the frontend pipeline fails with it.\n\n```yaml\n# frontend repo: .gitlab-ci.yml\nstages:\n  - build\n  - test\n  - trigger-backend\n\nbuild-frontend:\n  stage: build\n  script:\n    - echo \"Building frontend and generating API contract...\"\n    - mkdir -p dist\n    - |\n      echo '{\n        \"api_version\": \"v2\",\n        \"breaking_changes\": false\n      }' > dist/api-contract.json\n    - cat dist/api-contract.json\n  artifacts:\n    paths:\n      - dist/api-contract.json\n    expire_in: 1 hour\n\ntest-frontend:\n  stage: test\n  script:\n    - echo \"All frontend tests passed!\"\n\ntrigger-backend-pipeline:\n  stage: trigger-backend\n  trigger:\n    project: my-org/backend-service\n    branch: main\n    strategy: depend\n  rules:\n    - if: $CI_COMMIT_BRANCH == \"main\"\n```\n\n```yaml\n# backend repo: .gitlab-ci.yml\nstages:\n  - build\n  - test\n\nbuild-backend:\n  stage: build\n  script:\n    - echo \"All backend tests passed!\"\n\nintegration-test:\n  stage: test\n  rules:\n    - if: $CI_PIPELINE_SOURCE == \"pipeline\"\n  script:\n    - echo \"Fetching API contract from frontend...\"\n    - |\n      curl --silent --fail \\\n        --header \"JOB-TOKEN: $CI_JOB_TOKEN\" \\\n        --output api-contract.json \\\n        \"${CI_API_V4_URL}/projects/${FRONTEND_PROJECT_ID}/jobs/artifacts/main/raw/dist/api-contract.json?job=build-frontend\"\n    - cat api-contract.json\n    - |\n      if grep -q '\"breaking_changes\": true' api-contract.json; then\n        echo \"FAIL: Breaking API changes detected - backend integration blocked!\"\n        exit 1\n      fi\n      echo \"PASS: API contract is compatible!\"\n```\n\n\nA few things worth noting in this config. 
The `integration-test` job uses `$CI_PIPELINE_SOURCE == \"pipeline\"` to ensure it only runs when triggered by an upstream pipeline, not on a standalone push to the backend repo. The frontend project ID is referenced via `$FRONTEND_PROJECT_ID`, which should be set as a [CI/CD variable](https://docs.gitlab.com/ci/variables/) in the backend project settings to avoid hardcoding it.\n\n\nWhy it matters: Cross-service breakage that previously surfaced in production gets caught in the pipeline instead. The dependency between services stops being invisible and becomes something teams can see, track, and act on.\n\n\n![Cross-project pipelines](https://res.cloudinary.com/about-gitlab-com/image/upload/v1775738762/Blog/Imported/hackathon-fake-blog-post-s/image4_h6mfsb.png \"Cross-project pipelines\")\n\n\n## 3. Multi-tenant / matrix deployments: Dynamic child pipelines\n\n\nThe problem: You deploy the same application to 15 customer environments, or three cloud regions, or dev/staging/prod. Updating a deploy stage across all of them one by one is the kind of work that leads to configuration drift. Writing a separate pipeline for each environment is unmaintainable from day one.\n\n\nGitLab's [dynamic child pipelines](https://docs.gitlab.com/ci/pipelines/downstream_pipelines/#dynamic-child-pipelines) let you generate a pipeline at runtime. A job runs a script that produces a YAML file, and that YAML becomes the pipeline for the next stage. 
The pipeline structure itself becomes data.\n\n\n```yaml\n# .gitlab-ci.yml\nstages:\n  - generate\n  - trigger-environments\n\ngenerate-config:\n  stage: generate\n  script:\n    - |\n      # ENVIRONMENTS can be passed as a CI variable or read from a config file.\n      # Default to dev, staging, prod if not set.\n      ENVIRONMENTS=${ENVIRONMENTS:-\"dev staging prod\"}\n      for ENV in $ENVIRONMENTS; do\n        cat > ${ENV}-pipeline.yml \u003C\u003C EOF\n      stages:\n        - deploy\n        - verify\n      deploy-${ENV}:\n        stage: deploy\n        script:\n          - echo \"Deploying to ${ENV} environment\"\n      verify-${ENV}:\n        stage: verify\n        script:\n          - echo \"Running smoke tests on ${ENV}\"\n      EOF\n      done\n  artifacts:\n    paths:\n      - \"*.yml\"\n    exclude:\n      - \".gitlab-ci.yml\"\n\n.trigger-template:\n  stage: trigger-environments\n  trigger:\n    strategy: depend\n\ntrigger-dev:\n  extends: .trigger-template\n  trigger:\n    include:\n      - artifact: dev-pipeline.yml\n        job: generate-config\n\ntrigger-staging:\n  extends: .trigger-template\n  needs: [trigger-dev]\n  trigger:\n    include:\n      - artifact: staging-pipeline.yml\n        job: generate-config\n\ntrigger-prod:\n  extends: .trigger-template\n  needs: [trigger-staging]\n  trigger:\n    include:\n      - artifact: prod-pipeline.yml\n        job: generate-config\n  when: manual\n```\n\n\nThe generation script loops over an `ENVIRONMENTS` variable rather than hardcoding each environment separately. Pass in a different list via a CI variable or read it from a config file and the pipeline adapts without touching the YAML. The trigger jobs use [extends:](https://docs.gitlab.com/ci/yaml/#extends) to inherit shared configuration from `.trigger-template`, so `strategy: depend` is defined once rather than repeated on every trigger job. Add a new environment by updating the variable, not by duplicating pipeline config. 
Add [when: manual](https://docs.gitlab.com/ci/yaml/#when) to the production trigger and you get a promotion gate baked right into the pipeline graph.\n\n\nWhy it matters: SaaS companies and platform teams use this pattern to manage dozens of environments without duplicating pipeline logic. The pipeline structure itself stays lean as the deployment matrix grows.\n\n\n![Dynamic pipeline](https://res.cloudinary.com/about-gitlab-com/image/upload/v1775738765/Blog/Imported/hackathon-fake-blog-post-s/image7_wr0kx2.png \"Dynamic pipeline\")\n\n\n## 4. MR-first delivery: Merge request pipelines, merged results, and workflow routing\n\n\nThe problem: Your pipeline runs on every push to every branch. Expensive tests run on feature branches that will never merge. Meanwhile, you have no guarantee that what you tested is actually what will land on `main` after a merge.\n\n\nGitLab has three interlocking features that solve this together:\n\n\n*   [Merge request pipelines](https://docs.gitlab.com/ci/pipelines/merge_request_pipelines/) run only when a merge request exists, not on every branch push. This alone eliminates a significant amount of wasted compute.\n\n*   [Merged results pipelines](https://docs.gitlab.com/ci/pipelines/merged_results_pipelines/) go further. GitLab creates a temporary merge commit (your branch plus the current target branch) and runs the pipeline against that. You are testing what will actually exist after the merge, not just your branch in isolation.\n\n*   [Workflow rules](https://docs.gitlab.com/ci/yaml/workflow/) let you define exactly which pipeline type runs under which conditions and suppress everything else. 
The `$CI_OPEN_MERGE_REQUESTS` guard below prevents duplicate pipelines firing for both a branch and its open MR simultaneously.\n\n\nWith those three working together, here is what a tiered pipeline looks like:\n\n```yaml\n# .gitlab-ci.yml\nworkflow:\n  rules:\n    - if: $CI_PIPELINE_SOURCE == \"merge_request_event\"\n    - if: $CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS\n      when: never\n    - if: $CI_COMMIT_BRANCH\n    - if: $CI_PIPELINE_SOURCE == \"schedule\"\n\nstages:\n  - fast-checks\n  - expensive-tests\n  - deploy\n\nlint-code:\n  stage: fast-checks\n  script:\n    - echo \"Running linter\"\n  rules:\n    - if: $CI_PIPELINE_SOURCE == \"push\"\n    - if: $CI_PIPELINE_SOURCE == \"merge_request_event\"\n    - if: $CI_COMMIT_BRANCH == \"main\"\n\nunit-tests:\n  stage: fast-checks\n  script:\n    - echo \"Running unit tests\"\n  rules:\n    - if: $CI_PIPELINE_SOURCE == \"push\"\n    - if: $CI_PIPELINE_SOURCE == \"merge_request_event\"\n    - if: $CI_COMMIT_BRANCH == \"main\"\n\nintegration-tests:\n  stage: expensive-tests\n  script:\n    - echo \"Running integration tests (15 min)\"\n  rules:\n    - if: $CI_PIPELINE_SOURCE == \"merge_request_event\"\n    - if: $CI_COMMIT_BRANCH == \"main\"\n\ne2e-tests:\n  stage: expensive-tests\n  script:\n    - echo \"Running E2E tests (30 min)\"\n  rules:\n    - if: $CI_PIPELINE_SOURCE == \"merge_request_event\"\n    - if: $CI_COMMIT_BRANCH == \"main\"\n\nnightly-comprehensive-scan:\n  stage: expensive-tests\n  script:\n    - echo \"Running full nightly suite (2 hours)\"\n  rules:\n    - if: $CI_PIPELINE_SOURCE == \"schedule\"\n\ndeploy-production:\n  stage: deploy\n  script:\n    - echo \"Deploying to production\"\n  rules:\n    - if: $CI_COMMIT_BRANCH == \"main\"\n      when: manual\n```\n\nWith this setup, the pipeline behaves differently depending on context. A push to a feature branch with no open MR runs lint and unit tests only. 
Once an MR is opened, the workflow rules switch from a branch pipeline to an MR pipeline, and the full integration and E2E suite runs against the merged result. Merging to `main` queues a manual production deployment. A nightly schedule runs the comprehensive scan once, not on every commit.\n\n\nWhy it matters: Teams routinely cut CI costs significantly with this pattern, not by running fewer tests, but by running the right tests at the right time. Merged results pipelines catch the class of bugs that only appear after a merge, before they ever reach `main`.\n\n\n![Conditional pipelines (within a branch with no MR)](https://res.cloudinary.com/about-gitlab-com/image/upload/v1775738768/Blog/Imported/hackathon-fake-blog-post-s/image6_dnfcny.png \"Conditional pipelines (within a branch with no MR)\")\n\n\n\n![Conditional pipelines (within an MR)](https://res.cloudinary.com/about-gitlab-com/image/upload/v1775738772/Blog/Imported/hackathon-fake-blog-post-s/image1_wyiafu.png \"Conditional pipelines (within an MR)\")\n\n\n\n![Conditional pipelines (on the main branch)](https://res.cloudinary.com/about-gitlab-com/image/upload/v1775738774/Blog/Imported/hackathon-fake-blog-post-s/image5_r6lkfd.png \"Conditional pipelines (on the main branch)\")\n\n## 5. Governed pipelines: CI/CD Components\n\n\nThe problem: Your platform team has defined the right way to build, test, and deploy. But every team has their own `.gitlab-ci.yml` with subtle variations. Security scanning gets skipped. Deployment standards drift. Audits are painful.\n\n\nGitLab [CI/CD Components](https://docs.gitlab.com/ci/components/) let platform teams publish versioned, reusable pipeline building blocks. Application teams consume them with a single `include:` line and optional inputs — no copy-paste, no drift. 
Components are discoverable through the [CI/CD Catalog](https://docs.gitlab.com/ci/components/#cicd-catalog), which means teams can find and adopt approved building blocks without needing to go through the platform team directly.\n\n\nHere is a component definition from a shared library:\n\n```yaml\n# templates/deploy.yml\nspec:\n  inputs:\n    stage:\n      default: deploy\n    environment:\n      default: production\n---\ndeploy-job:\n  stage: $[[ inputs.stage ]]\n  script:\n    - echo \"Deploying $APP_NAME to $[[ inputs.environment ]]\"\n    - echo \"Deploy URL: $DEPLOY_URL\"\n  environment:\n    name: $[[ inputs.environment ]]\n```\nAnd here is how an application team consumes it:\n\n```yaml\n# Application repo: .gitlab-ci.yml\nvariables:\n  APP_NAME: \"my-awesome-app\"\n  DEPLOY_URL: \"https://api.example.com\"\n\ninclude:\n  - component: gitlab.com/my-org/component-library/build@v1.0.6\n  - component: gitlab.com/my-org/component-library/test@v1.0.6\n  - component: gitlab.com/my-org/component-library/deploy@v1.0.6\n    inputs:\n      environment: staging\n\nstages:\n  - build\n  - test\n  - deploy\n```\n\nThree lines of `include:` replace hundreds of lines of duplicated YAML. The platform team can push a security fix to `v1.0.7` and teams opt in on their own schedule — or the platform team can pin everyone to a minimum version. Either way, one change propagates everywhere instead of needing to be applied repo by repo.\n\n\nPair this with [resource groups](https://docs.gitlab.com/ci/resource_groups/) to prevent concurrent deployments to the same environment, and [protected environments](https://docs.gitlab.com/ci/environments/protected_environments/) to enforce approval gates - and you have a governed delivery platform where compliance is the default, not the exception.\n\n\nWhy it matters: This is the pattern that makes GitLab CI/CD scale across hundreds of teams. Platform engineering teams enforce compliance without becoming a bottleneck. 
Application teams get a fast path to a working pipeline without reinventing the wheel.\n\n\n![Component pipeline (imported jobs)](https://res.cloudinary.com/about-gitlab-com/image/upload/v1775738776/Blog/Imported/hackathon-fake-blog-post-s/image2_pizuxd.png \"Component pipeline (imported jobs)\")\n\n## Putting it all together\n\nNone of these features exist in isolation. The reason GitLab's pipeline model is worth understanding deeply is that these primitives compose:\n\n*   A monorepo uses parent-child pipelines, and each child uses DAG execution\n\n*   A microservices platform uses multi-project pipelines, and each project uses MR pipelines with merged results\n\n*   A governed platform uses CI/CD components to standardize the patterns above across every team\n\n\nMost teams discover one of these features when they hit a specific pain point. The ones who invest in understanding the full model end up with a delivery system that actually reflects how their engineering organization works, not a pipeline that fights it.\n\n## Other patterns worth exploring\n\n\nThe five patterns above cover the most common structural pain points, but GitLab's pipeline model goes further. A few others worth looking into as your needs grow:\n\n\n*   [Review apps with dynamic environments](https://docs.gitlab.com/ci/environments/) let you spin up a live preview for every feature branch and tear it down automatically when the MR closes. Useful for teams doing frontend work or API changes that need stakeholder sign-off before merging.\n\n*   [Caching and artifact strategies](https://docs.gitlab.com/ci/caching/) are often the fastest way to cut pipeline runtime after the structural work is done. 
Structuring `cache:` keys around dependency lockfiles and being deliberate about what gets passed between jobs with [artifacts:](https://docs.gitlab.com/ci/yaml/#artifacts) can make a significant difference without changing your pipeline shape at all.\n\n*   [Scheduled and API-triggered pipelines](https://docs.gitlab.com/ci/pipelines/schedules/) are worth knowing about because not everything should run on a code push. Nightly security scans, compliance reports, and release automation are better modeled as scheduled or [API-triggered](https://docs.gitlab.com/ci/triggers/) pipelines with `$CI_PIPELINE_SOURCE` routing the right jobs for each context.\n\n## How to get started\n\nModern software delivery is complex. Teams are managing monorepos with dozens of services, coordinating across multiple repositories, deploying to many environments at once, and trying to keep standards consistent as organizations grow. GitLab's pipeline model was built with all of that in mind.\n\nWhat makes it worth investing time in is how well the pieces fit together. Parent-child pipelines bring structure to large codebases. Multi-project pipelines make cross-team dependencies visible and testable. Dynamic pipelines turn environment management into something that scales gracefully. MR-first delivery with merged results ensures confidence at every step of the review process. And CI/CD Components give platform teams a way to share best practices across an entire organization without becoming a bottleneck.\n\nEach of these features is powerful on its own, and even more so when combined. 
GitLab gives you the building blocks to design a delivery system that fits how your team actually works, and grows with you as your needs evolve.\n\n> [Start a free trial of GitLab Ultimate](https://about.gitlab.com/free-trial/) to use pipeline logic today.\n\n## Read more\n\n*   [Variable and artifact sharing in GitLab parent-child pipelines](https://about.gitlab.com/blog/variable-and-artifact-sharing-in-gitlab-parent-child-pipelines/)\n*   [CI/CD inputs: Secure and preferred method to pass parameters to a pipeline](https://about.gitlab.com/blog/ci-cd-inputs-secure-and-preferred-method-to-pass-parameters-to-a-pipeline/)\n*   [Tutorial: How to set up your first GitLab CI/CD component](https://about.gitlab.com/blog/tutorial-how-to-set-up-your-first-gitlab-ci-cd-component/)\n*   [How to include file references in your CI/CD components](https://about.gitlab.com/blog/how-to-include-file-references-in-your-ci-cd-components/)\n*   [FAQ: GitLab CI/CD Catalog](https://about.gitlab.com/blog/faq-gitlab-ci-cd-catalog/)\n*   [Building a GitLab CI/CD pipeline for a monorepo the easy way](https://about.gitlab.com/blog/building-a-gitlab-ci-cd-pipeline-for-a-monorepo-the-easy-way/)\n*   [A CI/CD component builder's journey](https://about.gitlab.com/blog/a-ci-component-builders-journey/)\n*   [CI/CD Catalog goes GA: No more building pipelines from scratch](https://about.gitlab.com/blog/ci-cd-catalog-goes-ga-no-more-building-pipelines-from-scratch/)","5 ways GitLab pipeline logic solves real engineering problems","Learn how to scale CI/CD with composable patterns for monorepos, microservices, environments, and governance.",[732],"Omid Khan","https://res.cloudinary.com/about-gitlab-com/image/upload/v1772721753/frfsm1qfscwrmsyzj1qn.png","2026-04-09",[103,736,723,737],"DevOps 
platform","features",{"featured":24,"template":15,"slug":739},"5-ways-gitlab-pipeline-logic-solves-real-engineering-problems",{"content":741,"config":750},{"title":742,"description":743,"authors":744,"heroImage":746,"date":747,"body":748,"category":11,"tags":749},"How to use GitLab Container Virtual Registry with Docker Hardened Images","Learn how to simplify container image management with this step-by-step guide.",[745],"Tim Rizzi","https://res.cloudinary.com/about-gitlab-com/image/upload/v1772111172/mwhgbjawn62kymfwrhle.png","2026-03-12","If you're a platform engineer, you've probably had this conversation:\n  \n*\"Security says we need to use hardened base images.\"*\n\n*\"Great, where do I configure credentials for yet another registry?\"*\n\n*\"Also, how do we make sure everyone actually uses them?\"*\n\nOr this one:\n\n*\"Why are our builds so slow?\"*\n\n*\"We're pulling the same 500MB image from Docker Hub in every single job.\"*\n\n*\"Can't we just cache these somewhere?\"*\n\nI've been working on [Container Virtual Registry](https://docs.gitlab.com/user/packages/virtual_registry/container/) at GitLab specifically to solve these problems. It's a pull-through cache that sits in front of your upstream registries — Docker Hub, dhi.io (Docker Hardened Images), MCR, and Quay — and gives your teams a single endpoint to pull from. Images get cached on the first pull. Subsequent pulls come from the cache. 
Your developers don't need to know or care which upstream a particular image came from.\n\nThis article shows you how to set up Container Virtual Registry, specifically with Docker Hardened Images in mind, since that combination makes a lot of sense for teams that want stronger security without making their developers' lives harder.\n\n## What problem are we actually solving?\n\nThe platform teams I usually talk to manage container images across three to five registries:\n\n* **Docker Hub** for most base images\n* **dhi.io** for Docker Hardened Images (security-conscious workloads)\n* **MCR** for .NET and Azure tooling\n* **Quay.io** for Red Hat ecosystem stuff\n* **Internal registries** for proprietary images\n\nEach one has its own:\n\n* Authentication mechanism\n* Network latency characteristics\n* Way of organizing image paths\n\nYour CI/CD configs end up littered with registry-specific logic. Credential management becomes a project unto itself. And every pipeline job pulls the same base images over the network, even though they haven't changed in weeks.\n\nContainer Virtual Registry consolidates this. One registry URL. One authentication flow (GitLab's). Cached images are served from GitLab's infrastructure rather than traversing the internet each time.\n\n## How it works\n\nThe model is straightforward:\n\n```text\nYour pipeline pulls:\n  gitlab.com/virtual_registries/container/1000016/python:3.13\n\nVirtual registry checks:\n  1. Do I have this cached? → Return it\n  2. No? → Fetch from upstream, cache it, return it\n\n```\n\nYou configure upstreams in priority order. When an image pull comes in, the virtual registry checks each upstream until it finds the image. 
The result gets cached for a configurable period (default 24 hours).\n\n```text\n┌─────────────────────────────────────────────────────────┐\n│                    CI/CD Pipeline                       │\n│                          │                              │\n│                          ▼                              │\n│   gitlab.com/virtual_registries/container/\u003Cid>/image   │\n└─────────────────────────────────────────────────────────┘\n                           │\n                           ▼\n┌─────────────────────────────────────────────────────────┐\n│            Container Virtual Registry                   │\n│                                                         │\n│  Upstream 1: Docker Hub ────────────────┐               │\n│  Upstream 2: dhi.io (Hardened) ────────┐│               │\n│  Upstream 3: MCR ─────────────────────┐││               │\n│  Upstream 4: Quay.io ────────────────┐│││               │\n│                                      ││││               │\n│                    ┌─────────────────┴┴┴┴──┐            │\n│                    │        Cache          │            │\n│                    │  (manifests + layers) │            │\n│                    └───────────────────────┘            │\n└─────────────────────────────────────────────────────────┘\n```\n\n## Why this matters for Docker Hardened Images\n\n[Docker Hardened Images](https://docs.docker.com/dhi/) are great because of the minimal attack surface, near-zero CVEs, proper software bills of materials (SBOMs), and SLSA provenance. 
If you're evaluating base images for security-sensitive workloads, they should be on your list.\n\nBut adopting them creates the same operational friction as any new registry:\n\n* **Credential distribution**: You need to get Docker credentials to every system that pulls images from dhi.io.\n* **CI/CD changes**: Every pipeline needs to be updated to authenticate with dhi.io.\n* **Developer friction**: People need to remember to use the hardened variants.\n* **Visibility gap**: It's difficult to tell if teams are actually using hardened images vs. regular ones.\n\nVirtual registry addresses each of these:\n\n**Single credential**: Teams authenticate to GitLab. The virtual registry handles upstream authentication. You configure Docker credentials once, at the registry level, and they apply to all pulls.\n\n**No CI/CD changes per-team**: Point pipelines at your virtual registry. Done. The upstream configuration is centralized.\n\n**Gradual adoption**: Since images get cached with their full path, you can see in the cache what's being pulled. If someone's pulling `library/python:3.11` instead of the hardened variant, you'll know.\n\n**Audit trail**: The cache shows you exactly which images are in active use. 
Useful for compliance, useful for understanding what your fleet actually depends on.\n\n## Setting it up\n\nHere's a real setup using the Python client from this demo project.\n\n### Create the virtual registry\n\n```python\nfrom virtual_registry_client import VirtualRegistryClient\n\nclient = VirtualRegistryClient()\n\nregistry = client.create_virtual_registry(\n    group_id=\"785414\",  # Your top-level group ID\n    name=\"platform-images\",\n    description=\"Cached container images for platform teams\"\n)\n\nprint(f\"Registry ID: {registry['id']}\")\n# You'll need this ID for the pull URL\n```\n\n### Add Docker Hub as an upstream\n\nFor official images like Alpine, Python, etc.:\n\n```python\ndocker_upstream = client.create_upstream(\n    registry_id=registry['id'],\n    url=\"https://registry-1.docker.io\",\n    name=\"Docker Hub\",\n    cache_validity_hours=24\n)\n```\n\n### Add Docker Hardened Images (dhi.io)\n\nDocker Hardened Images are hosted on `dhi.io`, a separate registry that requires authentication:\n\n```python\ndhi_upstream = client.create_upstream(\n    registry_id=registry['id'],\n    url=\"https://dhi.io\",\n    name=\"Docker Hardened Images\",\n    username=\"your-docker-username\",\n    password=\"your-docker-access-token\",\n    cache_validity_hours=24\n)\n```\n\n### Add other upstreams\n\n```python\n# MCR for .NET teams\nclient.create_upstream(\n    registry_id=registry['id'],\n    url=\"https://mcr.microsoft.com\",\n    name=\"Microsoft Container Registry\",\n    cache_validity_hours=48\n)\n\n# Quay for Red Hat stuff\nclient.create_upstream(\n    registry_id=registry['id'],\n    url=\"https://quay.io\",\n    name=\"Quay.io\",\n    cache_validity_hours=24\n)\n```\n\n### Update your CI/CD\n\nHere's a `.gitlab-ci.yml` that pulls through the virtual registry:\n\n```yaml\nvariables:\n  VIRTUAL_REGISTRY_ID: \u003Cyour_virtual_registry_ID>\n\n  \nbuild:\n  image: docker:24\n  services:\n    - docker:24-dind\n  before_script:\n    # Authenticate 
to GitLab (which handles upstream auth for you)\n    - echo \"${CI_JOB_TOKEN}\" | docker login -u gitlab-ci-token --password-stdin gitlab.com\n  script:\n    # All of these go through your single virtual registry\n    \n    # Official Docker Hub images (use library/ prefix)\n    - docker pull gitlab.com/virtual_registries/container/${VIRTUAL_REGISTRY_ID}/library/alpine:latest\n    \n    # Docker Hardened Images from dhi.io (no prefix needed)\n    - docker pull gitlab.com/virtual_registries/container/${VIRTUAL_REGISTRY_ID}/python:3.13\n    \n    # .NET from MCR\n    - docker pull gitlab.com/virtual_registries/container/${VIRTUAL_REGISTRY_ID}/dotnet/sdk:8.0\n```\n\n### Image path formats\n\nDifferent registries use different path conventions:\n\n| Registry | Pull URL Example |\n|----------|------------------|\n| Docker Hub (official) | `.../library/python:3.11-slim` |\n| Docker Hardened Images (dhi.io) | `.../python:3.13` |\n| MCR | `.../dotnet/sdk:8.0` |\n| Quay.io | `.../prometheus/prometheus:latest` |\n\n### Verify it's working\n\nAfter some pulls, check your cache:\n\n```python\nupstreams = client.list_registry_upstreams(registry['id'])\nfor upstream in upstreams:\n    entries = client.list_cache_entries(upstream['id'])\n    print(f\"{upstream['name']}: {len(entries)} cached entries\")\n\n```\n\n## What the numbers look like\n\nI ran tests pulling images through the virtual registry:\n\n| Metric | Without Cache | With Warm Cache |\n|--------|---------------|-----------------|\n| Pull time (Alpine) | 10.3s | 4.2s |\n| Pull time (Python 3.13 DHI) | 11.6s | ~4s |\n| Network roundtrips to upstream | Every pull | Cache misses only |\n\n\n\n\nThe first pull is the same speed (it has to fetch from upstream). Every pull after that, for the cache validity period, comes straight from GitLab's storage. 
No network hop to Docker Hub, dhi.io, MCR, or wherever the image lives.\n\nFor a team running hundreds of pipeline jobs per day, that's hours of cumulative build time saved.\n\n## Practical considerations\n\nHere are some considerations to keep in mind:\n\n### Cache validity\n\n24 hours is the default. For security-sensitive images where you want patches quickly, consider 12 hours or less:\n\n```python\nclient.create_upstream(\n    registry_id=registry['id'],\n    url=\"https://dhi.io\",\n    name=\"Docker Hardened Images\",\n    username=\"your-username\",\n    password=\"your-token\",\n    cache_validity_hours=12\n)\n```\n\nFor stable, infrequently updated images (like specific version tags), longer validity is fine.\n\n### Upstream priority\n\nUpstreams are checked in order. If you have images with the same name on different registries, the first matching upstream wins.\n\n### Limits\n\n* Maximum of 20 virtual registries per group\n* Maximum of 20 upstreams per virtual registry\n\n## Configuration via UI\n\nYou can also configure virtual registries and upstreams directly from the GitLab UI, with no API calls required. Navigate to your group's **Settings > Packages and registries > Virtual Registry** to:\n\n* Create and manage virtual registries\n* Add, edit, and reorder upstream registries\n* View and manage the cache\n* Monitor which images are being pulled\n\n## What's next\n\nWe're actively developing:\n\n* **Allow/deny lists**: Use regex to control which images can be pulled from specific upstreams.\n\nThis is beta software. 
It works, people are using it in production, but we're still iterating based on feedback.\n\n## Share your feedback\n\nIf you're a platform engineer dealing with container registry sprawl, I'd like to understand your setup:\n\n* How many upstream registries are you managing?\n* What's your biggest pain point with the current state?\n* Would something like this help, and if not, what's missing?\n\nPlease share your experiences in the [Container Virtual Registry feedback issue](https://gitlab.com/gitlab-org/gitlab/-/work_items/589630).\n## Related resources\n- [New GitLab metrics and registry features help reduce CI/CD bottlenecks](https://about.gitlab.com/blog/new-gitlab-metrics-and-registry-features-help-reduce-ci-cd-bottlenecks/#container-virtual-registry)\n- [Container Virtual Registry documentation](https://docs.gitlab.com/user/packages/virtual_registry/container/)\n- [Container Virtual Registry API](https://docs.gitlab.com/api/container_virtual_registries/)",[723,722,737],{"featured":14,"template":15,"slug":751},"using-gitlab-container-virtual-registry-with-docker-hardened-images",{"promotions":753},[754,768,779,791],{"id":755,"categories":756,"header":758,"text":759,"button":760,"image":765},"ai-modernization",[757],"ai-ml","Is AI achieving its promise at scale?","Quiz will take 5 minutes or less",{"text":761,"config":762},"Get your AI maturity score",{"href":763,"dataGaName":764,"dataGaLocation":238},"/assessments/ai-modernization-assessment/","modernization assessment",{"config":766},{"src":767},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1772138786/qix0m7kwnd8x2fh1zq49.png",{"id":769,"categories":770,"header":771,"text":759,"button":772,"image":776},"devops-modernization",[722,565],"Are you just managing tools or shipping innovation?",{"text":773,"config":774},"Get your DevOps maturity 
score",{"href":775,"dataGaName":764,"dataGaLocation":238},"/assessments/devops-modernization-assessment/",{"config":777},{"src":778},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1772138785/eg818fmakweyuznttgid.png",{"id":780,"categories":781,"header":783,"text":759,"button":784,"image":788},"security-modernization",[782],"security","Are you trading speed for security?",{"text":785,"config":786},"Get your security maturity score",{"href":787,"dataGaName":764,"dataGaLocation":238},"/assessments/security-modernization-assessment/",{"config":789},{"src":790},"https://res.cloudinary.com/about-gitlab-com/image/upload/v1772138786/p4pbqd9nnjejg5ds6mdk.png",{"id":792,"paths":793,"header":796,"text":797,"button":798,"image":803},"github-azure-migration",[794,795],"migration-from-azure-devops-to-gitlab","integrating-azure-devops-scm-and-gitlab","Is your team ready for GitHub's Azure move?","GitHub is already rebuilding around Azure. Find out what it means for you.",{"text":799,"config":800},"See how GitLab compares to GitHub",{"href":801,"dataGaName":802,"dataGaLocation":238},"/compare/gitlab-vs-github/github-azure-migration/","github azure migration",{"config":804},{"src":778},{"header":806,"blurb":807,"button":808,"secondaryButton":813},"Start building faster today","See what your team can do with the intelligent orchestration platform for DevSecOps.\n",{"text":809,"config":810},"Get your free trial",{"href":811,"dataGaName":45,"dataGaLocation":812},"https://gitlab.com/-/trial_registrations/new?glm_content=default-saas-trial&glm_source=about.gitlab.com/","feature",{"text":501,"config":814},{"href":49,"dataGaName":50,"dataGaLocation":812},1777493620998]