Git: Part 2 - Some essential concepts

Overview

Featured Image "Essentials" by all black long johns is licensed under CC BY-NC-SA 2.0

In my last post, I introduced you to Git: we briefly covered its history and why it is useful. In this one, let us explore some key terminology and concepts, so that we have a common language for talking about Git.

Remote

A remote is what we generally call a copy of the repository that is maintained somewhere else, usually on a server.

  • It could be the ultimate source-of-truth repository, the one that all the build and deployment pipelines are configured against.
  • Or it could be just another copy of the original that you created in another location because you plan to make fundamental changes to the code base, changes that will need time and thorough testing before they reach production.

In short, a remote is any copy of the repository that is not on your local machine and to which you intend to push your changes once you are done working on a feature.

Thus, your local repository generally tracks something on the remote, and you can have multiple remotes too. You access a remote repository using a URL. Most Git hosting platforms support the SSH and HTTPS protocols. Learn more about the configurations in the earlier section.
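Here is a minimal sketch of a repository with more than one remote. The remote names and the example.com URLs are placeholders I made up for illustration; registering a remote only edits the local configuration, so nothing is contacted.

```shell
# Create an empty repository in a throwaway directory.
workdir=$(mktemp -d)
git init -q "$workdir/demo"
cd "$workdir/demo"

# Register two remotes (hypothetical URLs): one over HTTPS, one over SSH.
git remote add origin https://example.com/team/project.git
git remote add backup git@example.com:team/project.git

# List every remote with its fetch and push URL.
git remote -v

# Look up the URL of a single remote.
origin_url=$(git remote get-url origin)
echo "$origin_url"
```

By convention the remote you cloned from is called origin, but the name is just a label that you can choose freely.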

Git hosting platforms

I just mentioned Git hosting platforms casually and may have confused you. Git was designed to be distributed from the ground up. It is therefore not rare to see small groups of people working on different versions of the same product, each in its own copy of the original code base, on whichever server the group has chosen.
As a result, several companies have formed around helping developers create, host, collaborate on and deploy applications that have an underlying Git repository. Some popular ones are listed below. There are many more, but which ones exist only matters once you are actually working on a project hosted by one of them.

  • GitLab
  • GitHub
  • Bitbucket
  • Azure DevOps

Git Bash

Git Bash is a package for Microsoft Windows that uses mintty as its terminal emulator to provide a Bash (Bourne Again Shell) experience.

A shell is a program that lets you talk to the underlying operating system through typed commands. Bash is the default shell on many Linux distributions.

On Windows, Git is often installed as part of some other software. Visual Studio, for example, ships its own little Git installation and hides the underlying Git functionality behind a fancy-looking user interface. When you learn Git that way, you learn the user interface and not the underlying tool. So it is good to learn the tool on the command line first and then switch to a Graphical User Interface (GUI) for convenience.

Some Linux Commands

Remember some Linux commands? Never tried any? Not a problem at all. Let us check out some basic Linux commands that can help you on the Git Bash command line:

  • ls - list the contents of a path on the screen
  • cd - change directory
  • rm - remove files or directories
  • grep - search for text

All the commands listed above are general Linux commands, and you can find plenty of information about them online. Just use your favourite search engine.
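To get a feel for them, here is a small sketch you can paste into Git Bash. The file names are made up for illustration, and everything happens in a throwaway directory so none of your real files are touched.

```shell
# Work in a temporary directory.
workdir=$(mktemp -d)
cd "$workdir"                          # cd: change directory

# Create two small files to play with.
printf 'first line\nTODO: fix typo\n' > notes.txt
printf 'scratch\n' > scratch.txt

ls                                     # ls: list the contents of the current directory
grep -n 'TODO' notes.txt               # grep: search for text, -n adds line numbers
rm scratch.txt                         # rm: remove a file (there is no recycle bin)

# Capture the results so they can be inspected afterwards.
remaining=$(ls)
todo_line=$(grep -n 'TODO' notes.txt)
echo "$remaining"
echo "$todo_line"
```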

Copying a remote repository

Two terms you'll notice right away when you use a Git hosting platform are Fork and Clone. Both create a copy of a repository; what differs is where the copy ends up.

Fork

  • Creates a copy of the repository in another location on the same hosting platform
  • This is your personal copy of the project in the same git hosting platform
  • Make changes to it without affecting the original repository

Clone

  • Copy the repository to your local machine
  • This will enable you to make changes to the code in your local workstation
  • Most IDEs support cloning from within
  • From the command line:
    • git clone https://www.github.com/my-repository
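If you want to try cloning without a hosting account, the following sketch uses a bare repository on the local disk as a stand-in for the hosted one. The directory names hosted.git and my-copy are my own placeholders; with a real platform you would pass its HTTPS or SSH URL instead of a path.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the repository on the hosting platform.
git init -q --bare hosted.git

# Clone it into a new directory. The repository is still empty, so Git
# prints a warning, which is silenced here.
git clone -q hosted.git my-copy 2>/dev/null

# The clone already knows where it came from: a remote named "origin".
cd my-copy
git remote -v
clone_origin=$(git remote)
```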

Getting updates from remote

Fetch

  • git fetch
    • What's up, remote repository? Any new commits or branches there? Just asking, I won't change my local files yet

Pull

  • git pull remote branchname
    • Hey, what's up, remote repository? Give me all you've got on that branch and I'll integrate it into my local branch

Push

  • git push remote branchname
    • Hey there, I've got this cool stuff, keep it on remote for me, will you?
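The three commands can be tried together in a sketch like the one below. It assumes a local bare repository standing in for the hosted remote and two clones playing two developers; the names hosted.git, alice and bob, the branch name main and the identity used for the commit are all made up for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# A bare repository stands in for the remote on a hosting platform.
git init -q --bare hosted.git
git --git-dir=hosted.git symbolic-ref HEAD refs/heads/main

# Two clones play the roles of two developers (empty-clone warnings silenced).
git clone -q hosted.git alice 2>/dev/null
git clone -q hosted.git bob 2>/dev/null

# Alice commits locally, then pushes her branch to the remote.
cd alice
git symbolic-ref HEAD refs/heads/main   # pin the branch name regardless of local defaults
echo "hello" > readme.txt
git add readme.txt
git -c user.name=Alice -c user.email=alice@example.com commit -q -m "Add readme"
git push -q origin main

# Bob fetches: he learns about the new commit, but his files do not change.
cd ../bob
git symbolic-ref HEAD refs/heads/main
git fetch -q origin
fetched=$(git rev-parse origin/main)
before_pull=$(ls)

# Bob pulls: the fetched commit is integrated into his local branch.
git pull -q origin main
after_pull=$(cat readme.txt)
echo "after fetch: [$before_pull], after pull: [$after_pull]"
```

Notice that after the fetch Bob's directory is still empty, while after the pull readme.txt is there.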

How's it stored?

Trees and Blobs

Trees and blobs are how Git stores information. Git stores its data as a directed acyclic graph. Think One Direction here: history keeps going forward, while every commit keeps a way to trace back to the previous version. Almost every action you take in Git adds data to Git's database.

All information in git is of the following types:

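  • Commit: a pointer to a tree, plus metadata such as the author, the message and the previous commit
  • Tree: a directory listing that groups blobs (and other trees) under their file names
  • Blob: the contents of a file at one point in time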

Commits

Commits are what you create to store data in Git's database. Each one represents a point in the history graph that is a snapshot of the project at a certain time.

  • Holds a reference to the previous commit
  • Knows the author of the commit
  • And most often has a commit message - a description of the change.

[Figure: screenshot of a commit hash]

Each commit is uniquely identified by a long SHA-1 hash of its contents. The same holds for blobs, so two blobs with identical contents have the same hash.
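You can see the three object types, and the identical-hash behaviour, for yourself. This is a small sketch in a throwaway repository; the file names and the identity used for the commit are placeholders.

```shell
workdir=$(mktemp -d)
git init -q "$workdir/demo"
cd "$workdir/demo"

# Two files with identical contents.
echo "same content" > a.txt
echo "same content" > b.txt
git add a.txt b.txt
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "Add two identical files"

# Look up the blob behind each file; identical contents give identical hashes.
blob_a=$(git rev-parse HEAD:a.txt)
blob_b=$(git rev-parse HEAD:b.txt)
echo "$blob_a"
echo "$blob_b"

# Ask Git what type of object each name refers to.
commit_type=$(git cat-file -t HEAD)           # commit
tree_type=$(git cat-file -t 'HEAD^{tree}')    # tree
blob_type=$(git cat-file -t "$blob_a")        # blob
echo "$commit_type $tree_type $blob_type"

# Pretty-print the commit and the tree it points to.
git cat-file -p HEAD
git cat-file -p 'HEAD^{tree}'
```

In the last listing both a.txt and b.txt point at the same blob hash, so Git stores that content only once.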

States

A file or change in Git can be in one of three states.

  • Committed
    • data safely stored in the local git database
  • Modified
    • file has changed, but not yet committed
  • Staged
    • a modified file has been marked to go into your next commit snapshot

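These states also correspond to three different areas: the working directory (modified), the staging area (staged) and the Git directory (committed).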

[Figure: the three different Git areas]
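To watch a file move through the three states, here is a sketch using git status --short in a throwaway repository. The file name and the commit identity are placeholders of mine.

```shell
workdir=$(mktemp -d)
git init -q "$workdir/demo"
cd "$workdir/demo"

# Committed: the file is safely stored, so the status output is empty.
echo "first draft" > notes.txt
git add notes.txt
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "Add notes"
committed=$(git status --short)

# Modified: the file changed in the working directory (M in the second column).
echo "second draft with more text" > notes.txt
modified=$(git status --short)

# Staged: the change is marked for the next commit (M in the first column).
git add notes.txt
staged=$(git status --short)

printf 'committed: [%s]\nmodified:  [%s]\nstaged:    [%s]\n' "$committed" "$modified" "$staged"
```

The position of the M tells you which area holds the change: the right-hand column is the working directory, the left-hand column is the staging area.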

Git Database

Git does not run a server application to track the commits and changes made to your project. It tracks everything inside your project's directory itself. If you open your project directory and look for a hidden directory called .git, you'll find a lot of interesting stuff in there.

DO NOT delete this directory.
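If you are curious, it is safe to look inside as long as you only read. A minimal sketch in a throwaway repository follows; the exact list of entries varies between Git versions.

```shell
workdir=$(mktemp -d)
git init -q "$workdir/demo"
cd "$workdir/demo"

# Hidden entries only show up with -a.
ls -a

# Typical entries include HEAD, config, objects and refs.
ls .git

# Ask Git itself where its database lives.
git_dir=$(git rev-parse --git-dir)
echo "$git_dir"
```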

Let us get hands-on in the next post.
