This document describes the process to update a Sourcegraph instance deployed with Kubernetes Kustomize or the legacy Kubernetes method. If you are unfamiliar with Sourcegraph versioning or releases, see our general concepts documentation.
Please see our cautionary note on upgrades. If you have any concerns about running a multi-version upgrade, please reach out to us at support@sourcegraph.com for advisement.
Standard upgrades
A standard upgrade occurs between a Sourcegraph version and the minor or major version released immediately after it. If you would like to jump forward several versions, you must perform a multi-version upgrade instead.
Upgrade with Kubernetes Kustomize
The following procedure is for performing a standard upgrade with Sourcegraph instances version v4.5.0 and above, which have migrated to the deploy-sourcegraph-k8s repo.
Step 1: Create a backup copy of the deployment configuration file
Make a duplicate of the current cluster.yaml deployment configuration file that was used to deploy the current Sourcegraph instance.
If the Sourcegraph upgrade fails, you can redeploy using the current cluster.yaml file to roll back and restore the instance to its previous state before the failed upgrade.
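For example, a plain copy is enough (the cluster-backup.yaml name is hypothetical):
SH
cp cluster.yaml cluster-backup.yaml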
Step 2: Merge the new version of Sourcegraph into your release branch.
SH
cd $DEPLOY_SOURCEGRAPH_FORK
# get updates
git fetch upstream
# to merge the upstream release tag into your release branch
git checkout release
# Choose which version you want to deploy from https://github.com/sourcegraph/deploy-sourcegraph-k8s/tags
git merge $NEW_VERSION
Step 3: Build new manifests with Kustomize
Generate a new set of manifests locally using your current overlay instances/$INSTANCE_NAME (e.g. INSTANCE_NAME=my-sourcegraph) without applying them to the cluster, as sketched below.
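A minimal sketch of the build step, assuming your overlay directory is instances/my-sourcegraph and the output is written to cluster.yaml:
SH
kubectl kustomize instances/my-sourcegraph -o cluster.yaml
Step 4: Apply the updates to your cluster.
A sketch of the apply step; the --prune selector assumes your resources carry the deploy=sourcegraph label used by the reference manifests:
SH
kubectl apply --prune -l deploy=sourcegraph -f cluster.yaml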
Step 5: Monitor the status of the deployment to determine its success.
SH
$ kubectl get pods -o wide --watch
Upgrade with Legacy Kubernetes
The following procedure is for performing a standard upgrade with Sourcegraph instances in versions prior to v4.5.0, or which have not migrated and still use deploy-sourcegraph.
Step 1: Merge the new version of Sourcegraph into your release branch.
SH
cd $DEPLOY_SOURCEGRAPH_FORK
# get updates
git fetch upstream
# to merge the upstream release tag into your release branch
git checkout release
# Choose which version you want to deploy from https://github.com/sourcegraph/deploy-sourcegraph/tags
git merge $NEW_VERSION
Step 2: Update your install script kubectl-apply-all.sh
If you have specific commands that should be run whenever you apply your manifests, you should modify this script accordingly.
For example, if you use overlays to make changes to the manifests, you should modify this script to apply the manifests from the generated cluster directory instead.
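As a sketch, assuming your overlay writes its output to the generated-cluster directory, the apply line in kubectl-apply-all.sh would change to something like:
SH
kubectl apply --prune -l deploy=sourcegraph --recursive -f generated-cluster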
Step 3: Apply the updates to your cluster.
SH
$ ./kubectl-apply-all.sh
Step 4: Monitor the status of the deployment to determine its success.
SH
$ kubectl get pods -o wide --watch
Multi-version upgrades
⚠️ Attention: please see our cautionary note on upgrades. If you have any concerns about running a multi-version upgrade, please reach out to us at support@sourcegraph.com for advisement.
To perform a multi-version upgrade via the migrator's upgrade command on a Sourcegraph instance deployed with our Kustomize repo, follow the procedure below:
Check Upgrade Readiness:
Check the upgrade notes for the version range you're passing through.
Follow the standard legacy upgrade procedure to pull and merge upstream changes from the version you are upgrading to into your release branch.
Update cluster.yaml and scale down non-database deployments and replicas:
In your cluster kustomization file (instances/my-sourcegraph/kustomization.yaml), uncomment the multi-version-upgrade util. This will scale all non-database Deployment and StatefulSet replicas down to 0.
YAML
# - ../../components/utils/uid # -- Run all Postgres database with valid users on host
- ../../components/utils/multi-version-upgrade # -- Scale down non-database pods to 0 for multi-version upgrade
# - ../../components/utils/migrate-to-nonprivileged # -- Component for migrating from privileged to non-privileged
Note: This step will ensure that any PostgreSQL upgrade performed as an entrypoint script will have a chance to execute before the migrator is run. For more information see Upgrading PostgreSQL.
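Before running the migrator, you can verify the scale-down took effect; a quick check (assuming the sourcegraph namespace):
SH
kubectl -n sourcegraph get deployments,statefulsets
# All non-database Deployments and StatefulSets should report 0 replicas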
Run Migrator with the upgrade command:
For more detailed instructions and available command flags see our migrator docs.
In the configure/migrator/migrator.Job.yaml manifest:
set the image: to the latest release of migrator
set the args: value to upgrade. Example:
YAML
- name: migrator
  image: "index.docker.io/sourcegraph/migrator:6.0.0" # here we use a later version of migrator than the version we're upgrading to
  args: ["upgrade", "--from=v5.9.0", "--to=v5.11.0"]
  env:
Note:
Always use the latest image version of migrator for migrator commands, except for the startup command up
You may add the --dry-run flag to the command to test things out before altering the databases, as shown below
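A sketch of a dry run, reusing the hypothetical version range from the example above:
YAML
args: ["upgrade", "--from=v5.9.0", "--to=v5.11.0", "--dry-run"]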
Run the following commands to schedule the migrator job with the upgrade command and monitor its progress:
SH
# To ensure no previous job invocations will conflict with our current invocation
kubectl delete job migrator
# Start the migrator job
kubectl apply -f components/utils/migrator/resources/migrator.Job.yaml
# Stream the migrator's stdout logs for progress
kubectl logs job/migrator -f
Example:
SH
$ kubectl -n sourcegraph apply -f ./migrator.Job.yaml
job.batch/migrator created
$ kubectl -n sourcegraph get jobs
NAME       COMPLETIONS   DURATION   AGE
migrator   0/1           9s         9s
$ kubectl -n sourcegraph logs job/migrator
❗️ An error was returned when detecting the terminal size and capabilities:

    GetWinsize: inappropriate ioctl for device

Execution will continue, but please report this, along with your operating system, terminal, and any other details, to: https://sourcegraph.com/contact

✱ Sourcegraph migrator 4.5.1
👉 Migrating to v3.43 (step 1 of 3)
👉 Running schema migrations
✅ Schema migrations complete
👉 Running out of band migrations [1 2 4 5 7 13 14 15 16]
✅ Out of band migrations complete
👉 Migrating to v4.3 (step 2 of 3)
👉 Running schema migrations
✅ Schema migrations complete
👉 Running out of band migrations [17 18]
✅ Out of band migrations complete
👉 Migrating to v4.5 (step 3 of 3)
👉 Running schema migrations
✅ Schema migrations complete
$ kubectl -n sourcegraph get jobs
NAME       COMPLETIONS   DURATION   AGE
migrator   1/1           9s         35s
Generate and apply a new cluster.yaml file:
Comment out the multi-version-upgrade util in your cluster kustomization file (instances/my-sourcegraph/kustomization.yaml).
YAML
# - ../../components/utils/uid # -- Run all Postgres database with valid users on host
# - ../../components/utils/multi-version-upgrade # -- Scale down non-database pods to 0 for multi-version upgrade
# - ../../components/utils/migrate-to-nonprivileged # -- Component for migrating from privileged to non-privileged
Generate and apply a new cluster.yaml with the new images from Step 2.
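A sketch of the generate-and-apply step, under the same assumptions as before (overlay at instances/my-sourcegraph, resources labeled deploy=sourcegraph):
SH
kubectl kustomize instances/my-sourcegraph -o cluster.yaml
kubectl apply --prune -l deploy=sourcegraph -f cluster.yaml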
Your instance should now be up and running in the new version!
Using MVU to Migrate to Kustomize
Due to limitations with the Kustomize deployment method introduced in Sourcegraph v4.5.0, migrations to deploy-sourcegraph-k8s should be conducted separately from a full multi-version upgrade (e.g. v4.2.0 -> v5.0.3).
Admins upgrading a Sourcegraph instance older than v4.5.0 while migrating from our legacy Kubernetes offering to our new Kustomize manifests should upgrade to v4.5.0, perform the migrate procedure, and then perform the remaining upgrade to bring Sourcegraph up to the desired version.
Rollback
You can roll back by resetting your release branch to the old state before redeploying the instance.
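A minimal sketch of a single-version rollback, where $OLD_VERSION is the tag you previously deployed (hypothetical variable):
SH
git checkout release
git reset --hard $OLD_VERSION
# then redeploy: ./kubectl-apply-all.sh for legacy deployments, or regenerate and apply cluster.yaml for Kustomize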
If you are rolling back more than a single version, then you must also roll back your database, as database migrations (which may have run at some point during the upgrade) are only guaranteed to be compatible with one previous minor version.
For rolling back a multi-version upgrade, use the migrator downgrade command. Learn more in our downgrade docs.
Database migrations
In some situations, administrators may wish to migrate their databases before upgrading the rest of the system to reduce downtime. Sourcegraph guarantees database backward compatibility to the most recent minor point release so the database can safely be upgraded before the application code.
To execute the database migrations independently, follow the Kubernetes instructions on how to manually run database migrations. Running the up (default) command on the migrator of the version you are upgrading to will apply all migrations required by the next version of Sourcegraph.
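As a sketch, the migrator job from the earlier example would then use the default up command; the image tag here is illustrative and should be the migrator of the version you are upgrading to:
YAML
- name: migrator
  image: "index.docker.io/sourcegraph/migrator:5.11.0" # migrator of the version you are upgrading to
  args: ["up"]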
Improving update reliability and latency with node selectors
Some of the services that comprise Sourcegraph require more resources than others, especially if the
default CPU or memory allocations have been overridden. During an update when many services restart,
you may observe that the more resource-hungry pods (e.g., gitserver, indexed-search) fail to
restart, because no single node has enough available CPU or memory to accommodate them. This may be
especially true if the cluster is heterogeneous (i.e., not all nodes have the same amount of
CPU/memory).
If this happens, do the following:
Use kubectl drain $NODE to drain a node of existing pods, so it has enough allocation for the larger
service.
Run watch kubectl get pods -o wide and wait until the node has been drained. Run kubectl get pods to check that all pods except for the resource-hungry one(s) have been assigned to a node.
Run kubectl uncordon $NODE to enable the larger pod(s) to be scheduled on the drained node.
Note that the need to run the above steps can be prevented altogether with node
selectors, which
tell Kubernetes to assign certain pods to specific nodes.
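As a sketch, a Kustomize strategic-merge patch pinning a resource-hungry service to a set of nodes might look like this (the pool=large label is hypothetical):
YAML
# Pin gitserver to nodes labeled pool=large
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: gitserver
spec:
  template:
    spec:
      nodeSelector:
        pool: large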
High-availability updates
Sourcegraph is designed to be a high-availability (HA) service, but upgrades by default require a 10m downtime
window. If you need zero-downtime upgrades, please contact us. By default, services employ health checks to test the health of newly updated components before switching live traffic over to them. HA-enabling features include
the following:
Replication: nearly all of the critical services within Sourcegraph are replicated. If a single instance of a
service fails, that instance is restarted and removed from operation until it comes online again.
Updates are applied in a rolling fashion to each service such that a subset of instances are updated first while
traffic continues to flow to the old instances. Once the health check determines the set of new instances is
healthy, traffic is directed to the new set and the old set is terminated. By default, some database operations
may fail during this time as migrations occur, so a scheduled 10m downtime window is required.
Each service includes a health check that detects whether the service is in a healthy state. This check is specific to
the service. These are used to check the health of new instances after an update and during regular operation to
determine if an instance goes down.
Database migrations are handled automatically on update when they are necessary.