Last week Shamus Young decided to open source his excellent proof-of-concept procedural world generation project, codenamed “Frontier”. This is quite exciting, as there is a possibility that someone will clean it up and manage to tweak it into something with actual game-play. This is not what I wanted to talk about though – I wanted to talk about what happened immediately before this event.
Shamus asked the community where they would like the project to be hosted. The answer was a no-brainer – everyone in the comments agreed that GitHub was beyond a shadow of a doubt the best place to put it. And so, Shamus downloaded git and attempted to get the project under source control and out on the web. And then this happened:
Created an account. Created a repository. Installed Git locally. Followed the directions to set up git locally, which includes typing stuff into a Linux shell, which is trivial if you know what you’re doing and utterly, utterly mysterious if you don’t. Created ssh key. Set up a local repository. Added files meticulously one at a time from a list of hundreds of files because the Git GUI just lists all files and I don’t see how to filter for JUST source files. I hit commit and… nothing showed up.
His mounting frustration is very apparent, and I know exactly what he is experiencing. This was more or less my first encounter with git too. It is not straightforward. GitHub is not helping either, because it does not tell you that you can bypass the ssh key requirements by simply using https:// instead of git:// when setting up the remote repository. If you plan to use git and GitHub a lot, then generating and uploading the ssh key is definitely a good idea. But for a one-off project like this, it is a lot of unnecessary hassle and a significant hurdle for new users to overcome.
There is another significant issue at play here – Shamus, like many users new to distributed source control, sees familiar words like “commit” and assumes they work the same as in centralized repositories. Naturally they do not, but it is sometimes difficult to ascertain that by glancing at the official docs. Because, you know, that’s exactly what we all do when trying to get a new tool working. We glance at the docs, if we use them at all.
I remember my own confusion – when I was starting with Git, I wanted a quick and easy, one sentence explanation of what the fuck is the distributed thing all about, and how will it make my life difficult.
This was supposed to be a quick & easy thing, and I’m now 40 mins in, I’ve got Git infrastructure spewed all over my computer and I can’t get it to do this very simple thing. I’ve used source code control before, and it was always pretty straightforward. Even thirteen years ago, I never had to type crap into a console window to perform simple tasks. Is Git only for people who understand Linux? (The front end is all friendly and Windows-like, which is what led me to believe I’d be able to do this. If it started with a console window I would have realized this was for someone with a different skill set and looked elsewhere.)
I can relate to this too. When I first installed Git on Windows I tried to use the default GUI. I say tried, because I have never actually managed to get anywhere with it. The UI is so obtuse, confusing and convoluted I could not wrap my head around it. Forget user-friendly – Git GUI is downright user-hostile.
Fortunately, I discovered Tortoise Git which works more or less the same as the SVN equivalent. So I was able to hobble along and sort-of use git (but not really) until I realized that it is much easier to use the command line version. Shamus was not so lucky – the UI defeated him. This was a complete usability failure.
Granted, part of the problem lies on the side of the user. Git was a tool made by Linux geeks, to solve an array of very complex issues involved in massive collaboration projects such as Linux kernel development. It was not made to be user friendly, forgiving and nice. It is a tool deeply rooted in the Unix philosophy and designed to be what a high end power tool is for a craftsman – a precise, powerful and flexible instrument that nevertheless requires some skill to use.
The lack of training wheels is more or less by design. You would not put training wheels and streamers on a racing bicycle, would you? It would defeat the purpose. For a tool such as git, the user is expected to put in some time and effort up-front to understand the tool, and then recoup that time in productivity later.
I have no idea what Git wants or how it works. I don’t see ANYTHING that tells me how to push changes to the remote repository. If doing simple things like “submit changes” means using a terminal window, then… damn. What year is it? I know you Linux coders have a high tolerance for this sort of thing, but damn – there are better ways of using a computer these days. Case in point: If I had a menu, I would be able to work this out for myself.
I have a tremendous amount of respect for Shamus. His projects such as the Procedural City or Frontier are a standing body of evidence that he is a highly skilled, apt and capable individual. He failed not because of ignorance, but because he mistook a steep learning curve for bad design and called it quits early on. This is usually a good strategy – investing time and effort in learning a tedious, broken tool makes little or no sense. You spend a lot of time learning it up front, and then a lot of time fighting with it every time you need it to do something. It is a waste.
But git is actually a well designed and finely tuned tool. The crudeness of the Windows GUI is not really a problem with Git itself, but with the implementation of the Windows port. It does not condemn Git or GitHub. Unfortunately Shamus failed to look past that.
I agree that the default GUI in msysGit is really atrocious. It also doesn’t help that a lot of official Git and GitHub documentation is overly dense and needlessly complex. Here is my attempt at an easy zero-to-GitHub primer for someone who just wants to set up a repository, push their project online and forget about it.
Why Command Line
Believe it or not, the command line is the most efficient way to communicate with your computer. It is the closest thing we have to actually talking with the machine, in the way that is exact and precise (and not fuzzy pattern matching like Siri does with speech).
When you are working on a command line, you issue precise directives, whereas a GUI is like a big menu board from which you pick your options. That board has to be designed for each application, and feature every possible option for every circumstance. If the GUI designer hides advanced options for the sake of simplicity, power users will lose productivity clicking extra things or opening extra dialogs. If the designer shows too many options, the interface might become too dense and too difficult to navigate and use efficiently. Command line programs usually have sane defaults and then use arguments for extra stuff. So if you need more options, you just type more – and you only include as many as you need. You work at your own comfort level.
With Git, you will need to memorize about five basic commands which will cover 90% of the things you will be doing on a daily basis. The remaining 10% comes into play when you want better organization, when you mess up, or when you try to merge things that were not designed to be merged. In those situations things get hairy and ugly whether you use a GUI or CLI, so it really makes no difference. In my experience GUIs tend to exacerbate and obfuscate these sorts of issues more than they help.
If you don’t believe me, go download Tortoise Git and use that. It’s much better than the default GUI of msysGit. But keep in mind it is a crutch.
How Git Works
If you have ever used a centralized source control such as Subversion, you are probably used to something like this:
Everyone shares the same central remote repository. People check out code, modify it on their computer, and then check it back in. When you are working alone, this usually works quite well. If you work with a bunch of other people, the code they check in may conflict with your changes. When that happens things get hairy and you (or the project maintainer) will need to massage the code from both sides to make it fit before it can be committed and saved.
This is a big issue on large open source projects where an ugly merge can prevent everyone from checking code in. While it can be alleviated to a degree by use of tags and branches, it never really goes away.
Git (as well as Mercurial and other distributed source control tools) was designed to resolve this issue by structuring the system like this:
This is sort of how Github works. Everyone has their own public repository on Github. If you want to collaborate on a project, you “fork” it and you get your own, personal copy. But you don’t usually check code into the online public repository like in centralized systems.
Instead you make another private repository locally on your computer which is an exact clone of the public GitHub one. Then you modify and check in your code into that one. Then at the end of the day you can sync the changes to your public one by issuing a “push” command. Your work does not touch the original repository that you forked. But the owner of that repository can “pull” in your changes at any time. Then it is up to them to deal with merge disasters.
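If you want to see that cycle without touching GitHub at all, here is a rough sketch that fakes it on your own disk. The bare repository is only a stand-in for your public GitHub copy, and the folder names, file name and user details are all made up for the illustration.

```shell
# A throwaway sandbox; the bare repository plays the role of your public GitHub repo.
sandbox="$(mktemp -d)"
git init -q --bare "$sandbox/public.git"

# Clone it: this is the private, local repository you actually work in.
# (Git warns that the clone is empty, which is expected here.)
git clone -q "$sandbox/public.git" "$sandbox/work"
cd "$sandbox/work"
git config user.name "Example User"
git config user.email "user@example.com"

# Commit locally. Nothing has left your working copy yet.
echo "hello" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Sync the local commit to the public repository.
git push -q origin HEAD

# The public copy now reports the same commit as the local one.
git ls-remote origin
git rev-parse HEAD
```

The commit exists only in the local clone until the push line runs, which is exactly the part that trips up people coming from centralized tools.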
This is the beauty of distributed source control – you never need to worry about conflicts, unless you want to, and said conflicts will never prevent other people from working.
From Zero to Github in 5 Steps
Earlier I mentioned that there are 5 git commands you need to learn to get your project onto Github. I was not joking. Let me put my money where my mouth is and prove this:
Step 1: Create a Local Repository
You start by opening the shell and navigating to your project folder. On Windows you can just right-click on that folder and choose “Git Bash Here”. Then you type in:
git init
Boom, now your folder is a git repository. Yes, it’s that easy.
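If you are curious what actually changed, here is a small sketch you can run in a scratch folder. The temporary directory is just a stand-in for your real project folder.

```shell
# A throwaway folder stands in for your project directory.
project="$(mktemp -d)"
cd "$project"

git init -q

# All of git's bookkeeping lives in a hidden .git folder inside the project.
ls -d .git

# No commits and no tracked files yet.
git status
```

Nothing else in the folder is touched, so if you change your mind you can simply delete the .git folder and the project is back to being a plain directory.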
Step 2: Add Files
Next thing is to tell git what files you want to be committed:
git add .
This will put all the files in the folder under source control. You can also add files one at a time (by replacing the dot with a file name) or use wildcards (e.g. *.cpp or *.py).
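This is the answer to the “add hundreds of files one at a time” problem from the GUI. A minimal sketch, using invented file names in a scratch folder:

```shell
project="$(mktemp -d)"
cd "$project"
git init -q
touch main.cpp world.cpp world.h notes.txt

# Quote the patterns so that git, not the shell, expands them.
# Git then matches them in subfolders as well.
git add '*.cpp' '*.h'

# Staged files are marked with "A", untracked ones with "??".
git status --short
```

Here notes.txt stays untracked while the three source files are staged, and `git status` is the quickest way to double check what will go into the next commit.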
Step 3: Commit Added Files
Now we commit our changes into the local repository:
git commit -m "First commit"
Note that nothing goes to GitHub yet. Your changes can’t be seen by others. But you just made a snapshot of your code, and you can easily roll back to this state at a later point.
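To make the “snapshot” idea concrete, here is a sketch that commits a file, breaks it, and restores it from the last commit. It goes slightly beyond the five commands (`git log` and `git checkout` are extras), and the file name and user details are placeholders.

```shell
project="$(mktemp -d)"
cd "$project"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "version 1" > readme.txt
git add .
git commit -q -m "First commit"

# The snapshot shows up in the history.
git log --oneline

# Break the file, then restore it from the last commit.
echo "oops" > readme.txt
git checkout -- readme.txt
cat readme.txt
```

Be aware that `git checkout -- readme.txt` throws away the uncommitted edit for good, so it is meant for changes you really do want to discard.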
Step 4: Add a Remote Repository
This is the point where you go to GitHub and create yourself a project. Then you do this:
git remote add origin https://GitHubUsername@github.com/GithubUsername/Your-Project-Name.git
This is basically telling git: “Yo, git – I want you to become aware of a remote repository on GitHub, which I will from now on refer to as ‘origin’, and it’s located at this address”.
Note that I used https:// and not git:// like GitHub recommends. Why? Because this lets you bypass all the ssh key steps. What ssh key steps? Don’t worry about them. That’s my point – for now you don’t have to. Git will just prompt you for your GitHub password when you “push”, and that is good enough for now.
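Adding a remote only writes a nickname and an address into your local configuration; nothing is contacted until you push. A quick sketch to check what got recorded, using the same placeholder address as above (swap in your own user and project name):

```shell
project="$(mktemp -d)"
cd "$project"
git init -q

# Placeholder address; no network access happens at this point.
git remote add origin https://GitHubUsername@github.com/GithubUsername/Your-Project-Name.git

# List the configured remotes with their fetch and push addresses.
git remote -v
```

If you mistyped the address, `git remote remove origin` followed by a fresh `git remote add` is the simplest way to fix it.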
Step 5: Push
Now, let’s actually upload (or “push”) your code to GitHub:
git push origin master
Master is the name of the branch – the default one is always master, unless you have changed it. Origin is the nickname we gave to your repository in the last command. If everything worked correctly, you should see your files show up on Github.
Good news: we are done. Yep, that’s it. All it took was five commands and your code is under source control, and published on Github. You can usually accomplish this entire sequence in about a minute.
If you make changes to your code, and want to update the GitHub repository at a later time you just go back and re-use 3 of the commands you already know:
git add .
git commit -m "I made changes"
git push origin master
That’s it. That’s all you need to know to get started. Of course there is much more to Git than this – but you can pick up all the other useful commands later on (or not, if you don’t plan to use git often). Here is my personal cheat-sheet that includes the stuff I tend to use often.
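If you would like to rehearse the whole sequence before pointing it at a real account, here is a sketch that runs all five steps plus one follow-up change against a local bare repository. That bare repository is only a stand-in for the empty GitHub project, and the paths, file contents and user details are invented. The `git branch -M master` line is an extra I added as an assumption, because newer git installs may name the first branch something other than master.

```shell
sandbox="$(mktemp -d)"
git init -q --bare "$sandbox/remote.git"    # stand-in for the empty GitHub project

mkdir "$sandbox/project"
cd "$sandbox/project"
echo 'int main() { return 0; }' > main.cpp

git init -q                                  # Step 1: create a local repository
git config user.name "Example User"
git config user.email "user@example.com"
git add .                                    # Step 2: add files
git commit -q -m "First commit"              # Step 3: commit
git branch -M master                         # make sure the branch is called master
git remote add origin "$sandbox/remote.git"  # Step 4: add a remote
git push -q origin master                    # Step 5: push

# Later on: change something and repeat three of the commands.
echo '// TODO: generate a world' >> main.cpp
git add .
git commit -q -m "I made changes"
git push -q origin master

# The remote's master now points at the latest local commit.
git ls-remote origin master
```

Against GitHub the only difference is the address in step 4 and the authentication prompt on the first push.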
The command line is not scary. It only looks scary and intimidating because it is empty, and you can’t just intuit your way around it and wing it. Source control is probably one of those things you don’t necessarily want to “wing” though. So perhaps it is for the best. Still, once you overcome the fear of the shell, you can create a repository and get it deployed in about a minute or two with five easy commands. It is not difficult – it’s just not trivial.
Seasoned git users – would you add anything to my list? What would be the sixth command you need to know right away? Remember to keep it simple for newbs. And don’t say rebase, because I don’t think someone like Shamus would have any need for it at first.
Do you sympathize with Shamus and his Git issues? Did you go through a similar phase when you first started using git? How did you overcome it? What made it all click and fall into place for you? Do you use a GUI, and if so, which one?