antisocial things you can do in git (but not CVS) [PgSql]


From: Magnus Hagander on 20 Jul 2010 14:42

On Tue, Jul 20, 2010 at 20:34, Robert Haas <robertmhaas(a)gmail.com> wrote:
> I have some concerns related to the upcoming conversion to git and how
> we're going to avoid having things get messy as people start using the
> new repository. git has a lot more flexibility and power than CVS,
> and I'm worried that it would be easy, even accidentally, to screw up
> our history.
>
> 1. Inability to cleanly and easily (and programmatically) identify who
> committed what. With CVS, the author of a revision is the person who
> committed it, period. With git, the author string can be set to
> anything the person typing 'git commit' feels like. I think there is
> also a committer field, but that doesn't always appear and I'm not
> clear on how it works. Also, the author field defaults to something
> dumb if you don't explicitly set it to a value. So I'm worried we
> could end up with stuff like this in the repository:

I'm pretty sure we can enforce this on the server side, refusing
commits that don't follow our standard. I haven't done it myself
(yet), but I've read about it.

> My preference would be to stick to a style where we identify the
> committer using the author tag and note the patch author, reviewers,
> whether the committer made changes, etc. in the commit message. A
> single author field doesn't feel like enough for our workflow, and
> having a mix of authors and committers in the author field seems like
> a mess.

+1.

> 2. Branch and tag management. In CVS, there are branches and tags in
> only one place: on the server. In git, you can have local branches
> and tags and remote branches and tags, and you can pull and push tags
> between servers. If I'm working on a git repository that has branches
> master, REL9_0_STABLE .. REL7_4_STABLE, inner_join_removal,
> numeric_2b, and temprelnames, I want to make sure that I don't
> accidentally push the last three of those to the authoritative
> server... but I do want to push all the others. Similarly I want to
> push only the correct subset of tags (though that should be less of
> an issue, at least for me, as I don't usually create local tags). I'm
> not sure how to set this up, though.

We could put a safeguard in place on the server that won't let you
push a branch and require that additions of new branches be done
manually on the server?
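
A minimal sketch of such a safeguard, assuming it is implemented as a pre-receive hook on the authoritative repository. The function names `is_zero` and `reject_new_branches` are made up for illustration; the stdin format (old value, new value, ref name, with an all-zero old value for a ref that does not exist yet) is what git passes to pre-receive hooks.

```shell
#!/bin/sh
# Sketch of a pre-receive hook body that refuses to create branches.
# git writes one line per pushed ref to the hook's stdin:
#   <old-sha> <new-sha> <refname>
# and <old-sha> is all zeros when the ref does not exist yet.

is_zero() {
    # True when the argument consists only of '0' characters.
    [ -z "$(printf '%s' "$1" | tr -d 0)" ]
}

reject_new_branches() {
    status=0
    while read -r old new ref; do
        case "$ref" in
            refs/heads/*)
                if is_zero "$old"; then
                    echo "refusing to create new branch: $ref" >&2
                    status=1
                fi
                ;;
        esac
    done
    return "$status"
}

# When installed as hooks/pre-receive, the file would end with:
#   reject_new_branches
```

A non-zero exit from pre-receive makes the server reject the whole push, so existing branches could still be updated while a stray inner_join_removal would bounce. New branches would then have to be created by someone working directly on the server, where the hook does not run.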

> 3. Merge commits. I believe that we have consensus that commits
> should always be done as a "squash", so that the history of all of our
> branches is linear. But it seems to me that someone could
> accidentally push a merge commit, either because they forgot to squash
> locally, or because of a conflict between their local git repo's
> master branch and origin/master. Can we forbid this?

Again, I haven't done it, but I've read about it, and I'm almost
certain we can enforce this, yes.
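
One way this could be enforced, sketched under the assumption that a pre-receive hook is used: list the pushed commits with `git rev-list --merges` and refuse the push if any come back. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch of a pre-receive hook body that refuses merge commits.
# Each stdin line is: <old-sha> <new-sha> <refname>

is_zero() {
    # True when the argument consists only of '0' characters.
    [ -z "$(printf '%s' "$1" | tr -d 0)" ]
}

reject_merges() {
    status=0
    while read -r old new ref; do
        # A deleted ref has an all-zero new value: nothing to inspect.
        if is_zero "$new"; then
            continue
        fi
        if is_zero "$old"; then
            range=$new              # new ref: look at everything reachable
        else
            range="$old..$new"      # existing ref: only the pushed commits
        fi
        # --merges limits the listing to commits with more than one parent.
        merges=$(git rev-list --merges "$range")
        if [ -n "$merges" ]; then
            echo "refusing merge commit(s) on $ref:" >&2
            echo "$merges" >&2
            status=1
        fi
    done
    return "$status"
}

# When installed as hooks/pre-receive, the file would end with:
#   reject_merges
```

A committer who forgot to squash would get the rejection message and could redo the work locally as a single commit before pushing again.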

> 4. History rewriting. Under what circumstances, if any, are we OK
> with rebasing the master? For example, if we decide not to have merge
> commits, and somebody does a merge commit anyway, are we going to
> rebase to get rid of it?

That's something we need a good policy for. Merge commits are special.
For content commits, I think we should basically *never* do that. If
someone commits bad content, we should just make a revert commit which
keeps history linear and just removes the changes as a new commit.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Peter Eisentraut on 20 Jul 2010 15:12

On Tue, 2010-07-20 at 14:34 -0400, Robert Haas wrote:
> Right now, it's easy to find all the commits by a particular
> committer, and it's easy to see who committed a particular patch, and
> the number of distinct committers is pretty small. I'd hate to give
> that up.
>
> git log | grep '^Author' | sort | uniq -c | sort -n | less

git log --format=full | grep '^Commit' | sort | uniq -c | sort -n | less
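
A variant of the same count that asks git for the committer fields directly instead of grepping the full-format output. This is only a sketch: `commits_per_committer` and its repository-path argument are made up here, while `%cn` and `%ce` are git log's format placeholders for committer name and email.

```shell
# Sketch: count commits per committer from the committer fields.
commits_per_committer() {
    # $1 = path to a repository (hypothetical argument); any remaining
    # arguments are handed to git log, e.g. a branch name.
    repo=$1
    shift
    git -C "$repo" log --format='%cn <%ce>' "$@" | sort | uniq -c | sort -n
}
```

Called as `commits_per_committer . master | less`, it lists one line per distinct committer, whatever the author field of each commit says.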

> My preference would be to stick to a style where we identify the
> committer using the author tag and note the patch author, reviewers,
> whether the committer made changes, etc. in the commit message. A
> single author field doesn't feel like enough for our workflow, and
> having a mix of authors and committers in the author field seems like
> a mess.

Well, I had looked forward to actually putting the real author into the
author field.

> 2. Branch and tag management. In CVS, there are branches and tags in
> only one place: on the server. In git, you can have local branches
> and tags and remote branches and tags, and you can pull and push tags
> between servers. If I'm working on a git repository that has branches
> master, REL9_0_STABLE .. REL7_4_STABLE, inner_join_removal,
> numeric_2b, and temprelnames, I want to make sure that I don't
> accidentally push the last three of those to the authoritative
> server... but I do want to push all the others. Similarly I want to
> push only the correct subset of tags (though that should be less of
> an issue, at least for me, as I don't usually create local tags). I'm
> not sure how to set this up, though.

I'm going to use one separate clone for my development and one
"pristine" one for the final commits and copy the patches over manually.
That also solves the next problem ...
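
A rough sketch of how the copying step between two clones might look. The function name `carry_over`, its arguments, and the assumption that the development clone has a branch called master are all illustrative, not something stated in the thread.

```shell
# Sketch: carry the net change of a development branch into a separate
# "pristine" clone as one staged patch, without transferring any history.
carry_over() {
    dev=$1        # path to the development clone (hypothetical)
    pristine=$2   # path to the clone used only for final commits (hypothetical)
    branch=$3     # development branch to copy, e.g. numeric_2b
    # "master...branch" diffs the branch against its merge base with
    # master, so the output is the branch's combined change.
    git -C "$dev" diff "master...$branch" |
        git -C "$pristine" apply --index
}
```

After `carry_over ~/pg-dev ~/pg-pristine numeric_2b` the change is staged in the pristine clone, and a single `git commit` there records it as one commit. Because no history crosses between the clones, neither work-in-progress commits nor merge commits can reach the clone that pushes.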

> 3. Merge commits. I believe that we have consensus that commits
> should always be done as a "squash", so that the history of all of our
> branches is linear. But it seems to me that someone could
> accidentally push a merge commit, either because they forgot to squash
> locally, or because of a conflict between their local git repo's
> master branch and origin/master. Can we forbid this?

From: Robert Haas on 20 Jul 2010 17:21

On Tue, Jul 20, 2010 at 2:42 PM, Magnus Hagander <magnus(a)hagander.net> wrote:
> On Tue, Jul 20, 2010 at 20:34, Robert Haas <robertmhaas(a)gmail.com> wrote:
>> I have some concerns related to the upcoming conversion to git and how
>> we're going to avoid having things get messy as people start using the
>> new repository. git has a lot more flexibility and power than CVS,
>> and I'm worried that it would be easy, even accidentally, to screw up
>> our history.
>>
>> 1. Inability to cleanly and easily (and programmatically) identify who
>> committed what. With CVS, the author of a revision is the person who
>> committed it, period. With git, the author string can be set to
>> anything the person typing 'git commit' feels like. I think there is
>> also a committer field, but that doesn't always appear and I'm not
>> clear on how it works. Also, the author field defaults to something
>> dumb if you don't explicitly set it to a value. So I'm worried we
>> could end up with stuff like this in the repository:
>
> I'm pretty sure we can enforce this on the server side, refusing
> commits that don't follow our standard. I haven't done it myself
> (yet), but I've read about it.

+1, though I see downthread that Peter has a contrary opinion.

>> My preference would be to stick to a style where we identify the
>> committer using the author tag and note the patch author, reviewers,
>> whether the committer made changes, etc. in the commit message. A
>> single author field doesn't feel like enough for our workflow, and
>> having a mix of authors and committers in the author field seems like
>> a mess.
>
> +1.
>
>
>> 2. Branch and tag management. In CVS, there are branches and tags in
>> only one place: on the server. In git, you can have local branches
>> and tags and remote branches and tags, and you can pull and push tags
>> between servers. If I'm working on a git repository that has branches
>> master, REL9_0_STABLE .. REL7_4_STABLE, inner_join_removal,
>> numeric_2b, and temprelnames, I want to make sure that I don't
>> accidentally push the last three of those to the authoritative
>> server... but I do want to push all the others. Similarly I want to
>> push only the correct subset of tags (though that should be less of
>> an issue, at least for me, as I don't usually create local tags). I'm
>> not sure how to set this up, though.
>
> We could put a safeguard in place on the server that won't let you
> push a branch and require that additions of new branches be done
> manually on the server?

On this one, I'd just like a way to prevent accidents. Is there maybe
a config option I can set on my local repository?
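
One local setting that may fit here is git's `push.default`; with the value `nothing`, a bare `git push` refuses to guess which branches to send. A minimal sketch, where the wrapper name `protect_clone` and its path argument are made up for illustration:

```shell
# Sketch: make a bare "git push" in one clone refuse to pick branches itself.
protect_clone() {
    # $1 = path to the local clone (hypothetical argument)
    git -C "$1" config push.default nothing
}
```

With this set, `git push` without a refspec stops with an error, while something like `git push origin master REL9_0_STABLE` sends exactly the branches named and leaves inner_join_removal and friends at home. The remote name origin is an assumption about the local setup.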

>> 3. Merge commits. I believe that we have consensus that commits
>> should always be done as a "squash", so that the history of all of our
>> branches is linear. But it seems to me that someone could
>> accidentally push a merge commit, either because they forgot to squash
>> locally, or because of a conflict between their local git repo's
>> master branch and origin/master. Can we forbid this?
>
> Again, I haven't done it, but I've read about it, and I'm almost
> certain we can enforce this, yes.

OK, that sounds good...

>> 4. History rewriting. Under what circumstances, if any, are we OK
>> with rebasing the master? For example, if we decide not to have merge
>> commits, and somebody does a merge commit anyway, are we going to
>> rebase to get rid of it?
>
> That's something we need a good policy for. Merge commits are special.
> For content commits, I think we should basically *never* do that. If
> someone commits bad content, we should just make a revert commit which
> keeps history linear and just removes the changes as a new commit.

Yeah, I agree. I'm not sure that merge commits are the ONLY situation
where we'd want to do this, but it should be reserved for cases where
just reversing out the diff wouldn't be sufficient for some reason and
we need to make it as though it never happened. I don't think it's
necessary to disallow this completely: the default setting of allowing
it only with + is probably enough.
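
For comparison, here is what the stricter alternative could look like if the default ever proved too permissive. This is a sketch only: `lock_history` and its argument are hypothetical, while `receive.denyNonFastForwards` and `receive.denyDeletes` are git's server-side settings for refusing forced updates and ref deletions.

```shell
# Sketch: make a server-side repository refuse history rewrites outright.
lock_history() {
    # $1 = path to the bare repository on the server (hypothetical argument)
    # Refuse non-fast-forward updates, even when pushed with a leading "+".
    git -C "$1" config receive.denyNonFastForwards true
    # Refuse ref deletions, so a branch cannot be deleted and re-created
    # to get around the setting above.
    git -C "$1" config receive.denyDeletes true
}
```

Without these settings a plain `git push origin master` is still refused when it would rewrite history, and only a deliberate `git push origin +master` goes through, which is the behaviour described above.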

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From: Robert Haas on 20 Jul 2010 17:22

On Tue, Jul 20, 2010 at 3:12 PM, Peter Eisentraut <peter_e(a)gmx.net> wrote:
> Well, I had looked forward to actually putting the real author into the
> author field.

What if there's more than one? What if you make changes yourself?
How will you credit the reviewer?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

From: Andrew Dunstan on 20 Jul 2010 18:31

Robert Haas wrote:
> I have some concerns related to the upcoming conversion to git and how
> we're going to avoid having things get messy as people start using the
> new repository. git has a lot more flexibility and power than CVS,
> and I'm worried that it would be easy, even accidentally, to screw up
> our history.
>
> 1. Inability to cleanly and easily (and programmatically) identify who
> committed what.
>

Each committer sets their name and email using git config. Doesn't look
like a problem. We don't have such a large number of committers that
this should be much of an issue. Maybe we can set a pre-receive hook to
make sure that it's set appropriately?
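
A sketch of what such a pre-receive check might look like, assuming the server keeps a list of committer email addresses. The variable `ALLOWED_COMMITTERS`, the example.org addresses, and the function names are all placeholders; `%H` and `%ce` are git log's placeholders for commit hash and committer email.

```shell
#!/bin/sh
# Sketch of a pre-receive hook body that checks the committer of every
# pushed commit against a list of known addresses.

# Hypothetical allow-list, separated by spaces.
ALLOWED_COMMITTERS="alice@example.org bob@example.org"

is_zero() {
    # True when the argument consists only of '0' characters.
    [ -z "$(printf '%s' "$1" | tr -d 0)" ]
}

check_committers() {
    status=0
    while read -r old new ref; do
        # A deleted ref has an all-zero new value: nothing to inspect.
        if is_zero "$new"; then
            continue
        fi
        if is_zero "$old"; then
            range=$new
        else
            range="$old..$new"
        fi
        # One "<hash>:<committer email>" word per pushed commit.
        for entry in $(git log --format='%H:%ce' "$range"); do
            sha=${entry%%:*}
            email=${entry#*:}
            case " $ALLOWED_COMMITTERS " in
                *" $email "*) ;;
                *)
                    echo "commit $sha: unknown committer '$email'" >&2
                    status=1
                    ;;
            esac
        done
    done
    return "$status"
}

# When installed as hooks/pre-receive, the file would end with:
#   check_committers
```

This catches a committer whose `user.email` was never configured and fell back to a default. It only compares strings, so it guards against accidents rather than against someone deliberately setting another committer's address.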

> 2. Branch and tag management.
>
[snip]

I'm inclined to say that, as now, committers should not normally push
tags. Marc or whoever is managing things should create the various tags.
I think our current tagging policy is about right. "git push" doesn't
push tags by default, so you'd have to be trying hard to mess this up.

> 3. Merge commits. I believe that we have consensus that commits
> should always be done as a "squash", so that the history of all of our
> branches is linear. But it seems to me that someone could
> accidentally push a merge commit, either because they forgot to squash
> locally, or because of a conflict between their local git repo's
> master branch and origin/master. Can we forbid this?
>

Again, a pre-receive hook might be able to handle this. See
<http://progit.org/book/ch7-4.html>

> 4. History rewriting. Under what circumstances, if any, are we OK
> with rebasing the master? For example, if we decide not to have merge
> commits, and somebody does a merge commit anyway, are we going to
> rebase to get rid of it?
>
>

In the end, if we screw up badly enough we can just roll things back. It
would be a pain, but not insurmountably so. I think we need to expect
that there will be some teething issues. I keep 7 days worth of backups
of the CVS repo constantly now, and I'll probably do the same with git,
and I'm sure there will be other backups.

cheers

andrew
