Random notes about git

This post was written by eli on September 7, 2011
Posted Under: Linux,Software

Merely for myself, so I’ll remember how to do it. If I’m doing stupid things, please comment below.

Rule #1

If you’re about to do anything with git and you’re unsure about the consequences, always protect yourself with

$ git commit -a
$ git branch bettersafe
$ git checkout whatever

Assuming that you want to mess on the branch “whatever”

No matter what happens next, you can always return things to where they were by checking out bettersafe. Really. As long as you don’t mess up the bettersafe branch, of course. That is, don’t mention it in any subsequent command, and git won’t touch it. Don’t delete it, don’t use it in merge or rebase commands. Only check it out if things go wrong.

Otherwise, you can rebase, mess around, delete files and squash commits. No commits or other data will be lost, because bettersafe depends on them.

Delete bettersafe at some later time, of course. When you’re sure everything turned out OK.
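To convince myself that the safety net really holds, here is a minimal sketch in a throwaway repository. The file name, the commit messages and the temporary directory are made up for the demonstration, and the "mess" is simply a hard reset on the branch “whatever”.

```shell
set -e
# Identity for the throwaway commits (demo values only).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/whatever
echo "first version" > file.txt && git add file.txt && git commit -q -m "first"
echo "second, longer version" > file.txt && git commit -q -a -m "second"

# Rule #1: plant the safety branch before messing around.
git branch bettersafe
saved=$(git rev-parse bettersafe)

# Mess up "whatever": throw away its latest commit.
git reset -q --hard HEAD~1

# bettersafe still points at the commit that "whatever" just lost.
git checkout -q bettersafe
restored=$(cat file.txt)
echo "bettersafe is at $saved, file.txt says: $restored"
```

Nothing mentioned bettersafe between planting it and checking it out, which is the whole point.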

To keep in mind

  • For everyday use, git manages changes, not versions. Think of a commit as a patch, not a snapshot, even though git internally stores each commit as a full snapshot of the project, and the commit’s ID is derived from that snapshot.
  • A branch is just a pointer to some commit’s ID (possibly base) saying “my next commit will be based on this commit” (the latter is the branch’s HEAD).
  • Being on a branch controls where your next commit will go.
  • Checking out a branch is like checking out the root commit, and running all commits (think patches) up to that branch’s HEAD
  • Hence rebasing actually means moving the branch’s first commit’s parent ID.
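The "a branch is just a pointer" part of the list above can be checked with a small experiment. This is only a sketch in a scratch repository; the branch name “playaround” and the empty commits are assumptions for the sake of the demo.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "root"

# A new branch is just another name for the same commit ID.
git branch playaround
before=$(git rev-parse master)
same=$(git rev-parse playaround)

# Committing while on master moves master only; playaround stays put.
git commit -q --allow-empty -m "next"
after=$(git rev-parse master)
parent=$(git rev-parse master~1)
echo "master: $before -> $after"
echo "playaround is still at $(git rev-parse playaround)"
```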

Backing up and restoring the whole repo

To back up the entire repository:

$ git bundle create mybundle.git --all

This creates the mybundle.git file, which is an efficient storage of the entire repository.

To restore all branches and tags, the following sequence applies:

$ git clone mybundle.git myclone
$ cd myclone/
$ for i in `git bundle list-heads ../mybundle.git` ; do git checkout ${i##*/} ; done
$ git checkout master

The for-loop produces a lot of output, and eventually leaves the working tree on some random ref. So the final checkout gets you where you want to start, presumably “master” in the example above.

I will be most delighted to know if there exists a single command doing this. As it seems, just “git clone” creates a local branch only for the current branch (i.e. HEAD); the other branches arrive merely as remote-tracking refs (origin/something), with no local branch of their own. This probably makes sense when cloning from a remote repository with gazillions of branches. The concept of a local backup was probably neglected, because of the way git is usually used (that is, in collaboration).
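A possible variation on the restore loop, as a sketch only: instead of checking out every ref, read the "commit ID, ref name" pairs that list-heads prints, and create a local branch for each refs/heads/ entry that the clone didn't already set up. The directory names (src, myclone) and the two branches are assumptions made up for the demonstration, and tags are left to the clone itself.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
top=$(mktemp -d) && cd "$top"

# A small source repository with two branches, then the backup bundle.
git init -q src && cd src
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "on master"
git checkout -q -b playaround
git commit -q --allow-empty -m "on playaround"
git checkout -q master
git bundle create ../mybundle.git --all >/dev/null 2>&1

# Restore: clone, then add the local branches the clone didn't create.
cd "$top"
git clone -q mybundle.git myclone
cd myclone
git bundle list-heads ../mybundle.git | while read -r id ref; do
    case "$ref" in
        refs/heads/*)
            name=${ref#refs/heads/}
            git show-ref --verify --quiet "refs/heads/$name" || git branch "$name" "$id"
            ;;
    esac
done
git for-each-ref --format='%(refname:short)' refs/heads/
```

This never leaves the branch that the clone checked out, so no final checkout is needed.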

For a partial backup of a certain sequence of commits, say from “master” to “playaround”, go

$ git bundle create mybundle.git master..playaround

To use this bundle, “git pull” must be used on the target repository to get the commits. Needless to say, it must already have the commit pointed to by “master” when the bundle was made, or the entire bundle is completely worthless.

Oops. I messed up.

Git doesn’t delete anything except during periodic (or initiated) garbage collections, so if something was committed in the past, there’s a chance it’s still there, only invisible. The trick is to use

$ git reflog

and if the lost commit is found, tag it with

$ git tag rescue_me 7cb12d6

Tagging it makes the commit (and those leading to it) visible in gitk (given that it’s set to show all refs) and also prevents their removal on the next garbage collection.
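The whole rescue can be rehearsed in a scratch repository. A minimal sketch, assuming the commit was lost through a hard reset (so it is one step back in HEAD's reflog); the commit messages are made up.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "keep me"
git commit -q --allow-empty -m "about to be lost"
lost=$(git rev-parse HEAD)

# Drop the latest commit from the branch; no ref points at it any more.
git reset -q --hard HEAD~1

# The reflog still remembers where HEAD was one move ago.
git reflog | head -n 3
found=$(git rev-parse 'HEAD@{1}')

# Tag it, so it is visible again and safe from garbage collection.
git tag rescue_me "$found"
git log -1 --format='%h %s' rescue_me
```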

Git sequence for making a patch for submission

See Documentation/SubmittingPatches in the Linux tree.

First, clone the git repository listed in the MAINTAINERS entry for the subsystem the patch is going to. Probably better, add that repo as a remote to the existing main Linux repository, initially fetched with

$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Remember: The patch must apply (as in “git am”) against the “next” branch of the relevant subsystem’s repository.

Make the changes, verify that the latest version of sparse doesn’t complain with e.g.

$ make M=drivers/staging/mydriver C=2

The C=2 forces a sparse run even on files that need no compilation. C=1 only checks modified files.

Always grab the latest version of sparse and check against the latest possible kernel. Really. I really had new warnings popping up just by checking against a newer kernel.

When all is well, commit. If the patch is a response to a bug report, add a “Reported-by:” header at the end of the commit message manually. Same goes for Tested-by, Suggested-by and Reviewed-by. There is no automatic mechanism for this (like --signoff).

Also: Always have something meaningful written in the commit description. Just a title isn’t good enough (Greg rejected a very short patch of mine on the grounds that it had only a subject and no “changelog information”).

$ git commit -a
$ git format-patch master -o patch --signoff

This will create a directory called “patch” and put the patch there. If several commits are made, a different patch file is created for each. That’s what the commit -a is for.

To get all commits in one patch, use the --stdout flag, and of course redirect stdout to a file.

If this is a second submission (version 2 = v2), use

$ git format-patch master -o patch --signoff --subject-prefix="PATCH v2"

It’s also desirable to add a short description of the differences from earlier versions. The place to do it is under the “---” mark that follows the signoff. This can be done manually, or with git notes (see below) for creating a note for a commit, and then using the --notes flag with git format-patch. The result in the patch file is something like this:

Signed-off-by: Eli Billauer <eli.billauer@example.com>
---

Notes:
 This is a really horrible patch
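Output like the above can be produced without touching the kernel tree. This is a sketch in a scratch repository: the file driver.c, the branch “mywork” and the commit text are invented, and the note is attached with “git notes add -m” rather than through an editor.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "base" > driver.c && git add driver.c && git commit -q -m "base"
git checkout -q -b mywork
echo "the fix" >> driver.c
git commit -q -a -m "mydriver: fix something

Longer changelog text goes here."

# Attach the "changes since v1" text as a note on the commit.
git notes add -m "Changes since v1: fixed the typo" HEAD

# --notes puts the note after the --- line, outside the commit message.
git format-patch master -o patch --signoff --notes \
    --subject-prefix="PATCH v2" >/dev/null
patchfile=$(ls patch/*.patch)

# Show the mail header and commit message, up to the start of the diff.
sed -n '1,/^diff --git/p' "$patchfile"
```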

Check the patch with (run from kernel tree’s root)

$ scripts/checkpatch.pl --strict patch/0001-name-of-patch

(note that checkpatch.pl has a -f flag, which allows checking a source file rather than a patch)

The --strict flag deserves emphasis, because I forgot to use it on an important occasion. It’s turned on automatically, by the way, when checkpatch detects that the target directory is one of drivers/net, net/ or drivers/staging/. Yes, the directories are hardcoded in the script itself. Perl or not?

Whitespace cleanup:

$ scripts/cleanpatch patch/0001-name-of-patch

Send it through email (the way the kernel guys like it)

$ git send-email --to 'linux-kernel@vger.kernel.org' patch/0001-name-of-patch

Add a --cc flag to send a copy to someone other than yourself. The mail won’t kick off before confirming it, so don’t worry…

Now, since I’m using my Gmail address as the “From”, the mail must come from a Gmail mail server. In order to relay the mail through their servers (and not my own), there’s a whole story about that. See this post.

git notes

This is a neat mechanism for adding extra information to commits. Its main use is for change log information to patches, so they appear in the email but not in the commit (if and when it’s applied).

To make things simple, this is the only command needed to maintain notes:

$ git notes edit

It adds and/or edits the notes for the current commit with the selected editor.

As of gitk that goes along with 2.17.1 (2016 edition? It doesn’t say its version), the notes appear at the commit view, below the commit title, after a hard update (Ctrl-Shift-F5). There are also small yellow boxes next to the commit title in the tree view to mark that there are notes related to the commit.

Following the suggestion on this page, add the following lines to .git/config of the relevant repository.

[notes "rewrite"]
    amend = true
    rebase = true
[notes]
    rewriteRef = refs/notes/commits

The problem this solves: Notes refer to commits by their object ID, which changes when the commit is rebased or amended. As a result, the note becomes detached from the commit it related to. This chunk tells git to update the notes’ refs to follow the commits.

It’s also possible to add it to ~/.gitconfig, but if rebasing or commit amending is done on a repository that doesn’t have any notes, one gets a “warning: notes ref refs/notes/commits is invalid”.
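To see the rewrite settings at work, here is a sketch that applies them with “git config” to a scratch repository and then amends a commit. The file name, the commit messages and the note text are assumptions for the demo.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "one" > file.txt && git add file.txt && git commit -q -m "first attempt"
git notes add -m "v2: fixed the typo"

# Same settings as the config chunk above, for this repository only.
git config notes.rewrite.amend true
git config notes.rewrite.rebase true
git config notes.rewriteRef refs/notes/commits

old=$(git rev-parse HEAD)
git commit -q --amend -m "second attempt"
new=$(git rev-parse HEAD)

# The amended commit has a new ID, yet the note came along.
note=$(git notes show HEAD)
echo "$old -> $new"
echo "note: $note"
```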

Applying a Linux patch into a local project

The path of the files in the Linux project is deeper, so the application is with something like

$ git am -p4 patchdir/0001-this-is-the.patch

or the other way around: Applying a local project’s patch into the Linux kernel:

$ git am --directory drivers/mypath/mydriver/ 0001-this-is-the.patch

Applying a dirty patch

If the patch isn’t meant to be applied as is (e.g. when changes made in one file are applied to another by editing the patch file), format-patch can be used to generate the patch, and then go

$ git apply --reject --whitespace=fix patchdir/0001-This_is_the_patch

I think it’s best to remove the header containing the commit ID from which the patch was originally made, but haven’t really tried not doing this.

Preventing diff from working on a binary file

In particular, gitk has a tendency to try diffing *.ngc files and therefore freezing, since they start with text and then go blob. Create (or edit) .gitattributes in the project’s root directory (man gitattributes) to say

*.ngc binary

(as usual with git, the solution is painfully simple) and it probably makes sense to commit this file into the project as well.
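A quick way to check that the attribute took effect, sketched in a scratch repository. The file design.ngc and its content are made up; “git check-attr” just reports what git thinks of the path.

```shell
set -e
cd "$(mktemp -d)"
git init -q .

# Hypothetical netlist file: starts as text, then turns into binary junk.
printf 'some text header\n\000\001\002' > design.ngc
echo '*.ngc binary' > .gitattributes

# "binary" is a built-in macro that, among other things, unsets "diff".
attr=$(git check-attr diff -- design.ngc)
echo "$attr"
```

If the diff attribute shows as unset, git (and hence gitk) won't attempt a textual diff on the file.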

Messing around with commits

For a graphical representation of the branches and commits, change directory to where the git commands are run from and simply go

$ gitk &

To see the complete tree of local branches (recommended) go to View > New View and check “All (local) branches”.

Add files to be watched by git (and hence relevant on the next commit)

$ git add filename filename2 ...

This can also be done as one step of a commit command. Committing all watched files:

$ git commit -a

Oops…? Made some changes which would fit best in the last commit? Want to fix the last commit, and re-edit the commit message? Easy, just go

$ git commit -a --amend

Note that this changes the commit message as well as updating the commit according to the working tree.

And change the date of the commit to “now”:

$ git commit --date "`date -R`" --amend

… but that changes only the authoring date. To set the committing date to the same, go

$ git rebase --committer-date-is-author-date HEAD~1

And of course, the latter command can go deeper into history, depending on what’s used instead of “HEAD~1”.

For setting the date to the current one on several commits, use e.g.

$ git rebase --ignore-date origin/master

To commit only part of the changes:

$ git add --all -p
$ git commit

To use an existing commit’s log message (in particular if it’s cherry-picked or even orphaned):

$ git commit -c 77ddd228c8cc26801eb83f421048d30fa1c31564

To move the current branch’s head, so it doesn’t include the latest commit (but leave the changes in the sources), a.k.a. “remove the commit”:

$ git reset HEAD~1

Note that the commit stays in the repository until garbage collected, and its changes remain in the worktree. This actually says “move the branch one step (as in HEAD~1) back”. A “git checkout -f” will remove the commit’s effect on the worktree as well.

To really go back one commit (that is, revert its changes in the working tree),

$ git reset --hard HEAD~1

Even though the commit stays in the repository until garbage collected, that may happen sooner than desired. So if the commit may be useful in the future, create a new branch (e.g. “delme”) before backing off the current one, so the commit isn’t lost.  If this is done by mistake, just cherry-pick the reverted commit with gitk (hoping it’s still there).
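The difference between the two resets is easiest to see side by side. A minimal sketch in a scratch repository, with the “delme” branch playing the role suggested above; the file name and its content are invented.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo "first version" > file.txt && git add file.txt && git commit -q -m "first"
echo "second, longer version" > file.txt && git commit -q -a -m "second"
git branch delme     # anchor, so "second" survives garbage collection

# Plain reset: the branch moves back, the working tree keeps the change.
git reset -q HEAD~1
after_plain=$(cat file.txt)

# Return to "second" for the next experiment.
git reset -q --hard delme

# Hard reset: the working tree is forced to match the older commit too.
git reset -q --hard HEAD~1
after_hard=$(cat file.txt)

echo "after plain reset: $after_plain"
echo "after hard reset:  $after_hard"
```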

And check the latest commits with

$ git log

Rebasing your own experiments (branch “foolaround”) on top of the master branch (so your games are on the real thing + your changes)

$ git rebase master foolaround

Note that the last argument, “foolaround”, tells git to check out this branch first, and then rebase it onto master. It’s otherwise assumed that the current branch is rebased, so this format is somewhat safer. But “git rebase master” is fine as well (and rebasing “master” on itself just says that the branch is up to date, which happens to always be true).

Move some commit to the top of some other branch, say to master:

$ git rebase 77ddd228c8cc26801eb83f421048d30fa1c31564 master

where that blob in the middle is the commit ID, of course. (Unlike “git cherry-pick”, which applies the changes only, without changing the tree of commits.)

To fool around with the 4 last commits (reorder, remove, squash several commits into one):

$ git rebase -i HEAD~4

Note that when squashing, the commit marked to squash is mixed with the commit on the row above in the list (which was committed earlier in time, as the list runs from oldest to newest). An opportunity to edit the commit message will be given anyhow, so just mark the commits for squashing and go on with it. Of course, several commits can be marked in a row for a multi-squash.
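The squash can be rehearsed without an editor popping up, by letting environment variables stand in for the interactive parts. A sketch only: it assumes GNU sed (for the -i flag), and the three commits are made up. The todo list runs oldest-first, so marking the second row with “squash” folds “commit 3” into “commit 2”.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
for n in 1 2 3; do
    echo "line $n" >> file.txt
    git add file.txt
    git commit -q -m "commit $n"
done

# Stand-ins for the editors: change "pick" to "squash" on the second row
# of the todo list, and accept the combined commit message unchanged.
GIT_SEQUENCE_EDITOR="sed -i -e 2s/pick/squash/" GIT_EDITOR=true \
    git rebase -q -i HEAD~2

count=$(git rev-list --count HEAD)
subject=$(git log -1 --format=%s)
echo "$count commits left, the latest is titled: $subject"
```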

What have I changed since the last commit?

$ git diff

What is the difference between now and some branch?

$ git diff somebranch

What is the commit ID of a given tag, branch or other commit one can refer to (HEAD in this example)?

$ git rev-parse HEAD

What’s the last tag issued in the current branch?

$ git describe --tags --abbrev=0

To check for another branch, add its name as the last argument for the same command above.

Apply changes as if checking out another branch, but stay in place:

$ git checkout -p thatbranch

Each hunk (that is, piece to change) is prompted for. Say “y” to all, and the working tree will be like “thatbranch”. Say “n” to all, and nothing changes. One way or another, you don’t switch branch, only the files change. This is, in fact, not a checkout.

Find the directory where .git/ sits:

$ git rev-parse --show-toplevel

Untracking files already in index

This recipe works when it’s OK to temporarily remove the files from the worktree.

First, edit the .gitignore so that the relevant files will be ignored in the future. Then commit .gitignore and create a tag on the last commit, say “hold”. Make sure that the files are indeed tracked:

$ git ls-files | less

Now remove the files from the index, possibly with several commands e.g.:

$ git rm --cached useless-file

This changes nothing in the working tree for now, but only marks the files as gone in the index.

And then commit. The files will be removed in the working tree.

$ git commit

To have the files back, patch-checkout the previous commit. HEAD~1 would work instead of “hold”, but the purpose of hold was also to be an anchor for messups:

$ git checkout -p hold

When applying the relevant hunks, git asks as if these will be applied to the index as well, but in fact they don’t go to the index because of the updated .gitignore. So when the process is finished, git remains in sync (unlike it would in a normal checkout -p session).

If all is well, the “hold” tag can be removed.

To get a list of branches so a script can handle the output (as opposed to just “git branch” which isn’t safe):

$ git for-each-ref --format='%(refname:short)' refs/heads/
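For example, a script can loop over that output directly. A small sketch in a scratch repository; the branch names are made up.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "root"
git branch foolaround
git branch playaround

# One branch name per line: no "*" marker and no padding to strip off.
git for-each-ref --format='%(refname:short)' refs/heads/ | while read -r b; do
    echo "$b is at $(git rev-parse --short "$b")"
done
names=$(git for-each-ref --format='%(refname:short)' refs/heads/ | tr '\n' ' ')
```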

Cleaning up the directory tree

Use with caution, and think twice before doing this: It deletes all files not tracked in the repo. All those nice private scripts and stuff? Gone.

To clean untracked files (those appearing in “git status”) go first

$ git clean -n

and see what the damage is, and then, once you’re sure, replace the -n with -f (by default, git refuses to clean without it). To delete all files, regardless of .gitignore (this is “make mrproper”, just a little more aggressive):

$ git clean -n -d -fx
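A harmless rehearsal of the above, as a sketch in a scratch repository. The two untracked files are invented: one is covered by .gitignore, the other is not.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
echo '*.o' > .gitignore
git add .gitignore && git commit -q -m "ignore object files"
touch private-script.sh built.o

# Dry run: only lists what would go. built.o is spared by .gitignore.
dry=$(git clean -n)
echo "$dry"
survivors=$(ls private-script.sh built.o | tr '\n' ' ')

# The real thing: -f to actually delete, -x to take ignored files as well.
git clean -q -f -x
```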

Windows: Remove git from Explorer right-click menus

Uninstalling git involves closing explorer.exe (!) and didn’t work even so on my computer, so I took the registry editing route.

Basically removed the HKEY_CLASSES_ROOT\*\shellex\ContextMenuHandlers\Git-Cheetah, HKEY_CLASSES_ROOT\Directory\Background\shellex\ContextMenuHandlers\Git-Cheetah, HKEY_CLASSES_ROOT\Drive\shellex\ContextMenuHandlers\Git-Cheetah registry keys and other places where I found Git-Cheetah under ContextMenuHandlers.

Also removed anywhere the string {ca586c80-7c84-4b88-8537-726724df6929} appeared under something having to do with shell extensions. A bit scary, but harmless.

Setting up my own little git server

A git server is just like any repository, only the data is kept “bare”, that is without the working tree, and with the files usually in the .git subdirectory residing directly on the repo’s root. A simple “git clone” with the repo’s root yields the full git repository, in case of doubt. The gitk utility can also be run from the repo’s root.

First thing first:

# yum install git-daemon

Among others, this adds /etc/xinetd.d/git, which is disabled by default.

Edit the file to read:

# default: off
# description: The git dæmon allows git repositories to be exported using \
#       the git:// protocol.

service git
{
 disable         = no
 socket_type     = stream
 wait            = no
 only_from       = 10.1.1.0/24 127.0.0.1
 user            = git
 server          = /usr/libexec/git-core/git-daemon
 server_args     = --base-path=/home/git/public_git --export-all --syslog --inetd --verbose --enable=upload-pack --enable=receive-pack
 log_on_failure  += USERID
}

Create a new user “git”, create public_git in its home directory, and then go

# service xinetd restart

And set up a new repo, e.g. (as user “git” in its home directory):

$ mkdir test
$ cd test/
$ git --bare init

And then, as any user, bind the origin. This is not necessary if the repository is cloned from the origin anyhow.

$ git remote rm origin
$ git remote add origin git://localhost/test/

And possibly push and fetch (I don’t like pulling, because the merging can fail). See my note about a problem with “git push” on Windows.

$ git push --all
$ git fetch --all

Note that the new commit will not necessarily appear after fetching, unless the display mode is set to “all refs”.

To push a certain branch to the remote repo, despite “push --all” failing because some history will be killed, go

$ git push -n origin +master
$ git push origin +master

for pushing the master branch by force (the -n option is for a dry-run). This is OK in particular after some commit munching on a branch, which no other development entity depends on. Or the other side will have to cherry-pick its latest commits. Or move its local “master” branch to the new one, and rebase from there (losing tags).

To synchronize with the remote repository, either merge or rebase. The remote branch is called origin/branch, e.g.

$ git rebase origin/master

rebases the local changes on the remote ones on the master branch.

To push all tags to the remote server, go

$ git push --tags

but remember not to do this when there are temporary tags (because the only way to get rid of them is from the server).

To turn the remote into an exact copy of the local repo, use

$ git push --mirror

with care: No questions are asked. The remote repo is just overridden. Whatever is gone on the local repo will be gone in the remote one. This is important in particular if commits have been squashed etc. Try “push --all” first. If that fails, it’s likely that the other copies of the repository will have to be re-cloned. Which is fairly OK if they’ve all pushed --all and the current repo has at least fetched --all.

Always make sure there is a “master” branch when pushing. Otherwise attempts to clone will result in “remote HEAD refers to nonexistent ref, unable to checkout” and a useless local copy. This is because HEAD points at “master” by default.

Remote repos

$ git branch --set-upstream-to=theorigin/master localbranch

This makes the local branch track the “master” branch on some remote repository. Unfortunately, for pushing, one still has to be explicit:

$ git push theorigin localbranch:master

This makes git treat “localbranch” as the remote’s “master”, and it’s necessary despite the upstream setting above.
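Both steps can be tried locally, with a bare repository in a temporary directory standing in for the remote server. This is a sketch: the directory names are made up, and the upstream is set with the --set-upstream-to spelling, which is the one current git versions accept.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
top=$(mktemp -d) && cd "$top"
git init -q --bare remote.git        # stands in for the remote server
git init -q work && cd work
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "root"
git remote add theorigin ../remote.git
git push -q theorigin master
git fetch -q theorigin

git checkout -q -b localbranch
git commit -q --allow-empty -m "local work"

# Make localbranch track the remote's master.
git branch --set-upstream-to=theorigin/master localbranch >/dev/null
upstream=$(git rev-parse --abbrev-ref 'localbranch@{upstream}')

# Explicit refspec: the local "localbranch" lands on the remote's "master".
git push -q theorigin localbranch:master
remote_master=$(git --git-dir=../remote.git rev-parse master)
echo "upstream: $upstream, remote master is now at $remote_master"
```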

Uploading to github

Definitely, go for ssh authentication. It requires copying the content of ~/.ssh/id_rsa.pub into a text window under Github’s web interface: Click on the icon at the top right, choose Settings and then select “SSH and GPG keys”. Then be sure to define git’s remote as “git@github.com:billauer/theproject.git” and not the https: thing. And from this point on, it’s just like password-less ssh. That is, forget about authentication.

Ah, but it’s not that simple if you have multiple accounts on Github. Because if an ssh key is already used for another user, it’s rejected when trying to apply it on a new one, with the error message “Key is already in use”. The solution is to maintain multiple keys. This is only mildly annoying once you get the hang of it. See section named “Having multiple SSH keys” in another post of mine. The trick is to invent a name for a host, say, “sillygit”. Set git’s remote as “sillygit:billauer/theproject.git”. Then set SSH’s config file to recognize “sillygit” as the name of a host, and apply a set of parameters: The real host name, the user name and the SSH key to use.

Don’t confuse this with Deploy keys, which are per-repository SSH keys. These keys allow access to a specific repository only.
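For the “sillygit” trick, the SSH config entry could look something like the sketch below. The key file name id_rsa_second is an assumption (use whatever the second key is called), and IdentitiesOnly is my own addition, meant to keep ssh from offering the other key first. The entry is written to a scratch file here; the real thing belongs in ~/.ssh/config.

```shell
# Scratch file for the demonstration; merge its content into ~/.ssh/config.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
Host sillygit
    HostName github.com
    User git
    IdentityFile ~/.ssh/id_rsa_second
    IdentitiesOnly yes
EOF
cat "$cfg"
```

With this in place, “git remote add origin sillygit:billauer/theproject.git” makes git reach github.com as user “git” with the second key.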

Access to multiple users

If the repository should be accessible to multiple users that belong to a group, add this to the [core] section of the repository’s config file:

sharedRepository = true

See git’s doc on configuration parameters.
