Testing with concurrent sessions [PgSql]


From: Markus Wanner on 16 Jan 2010 01:51

Hi,

Kevin Grittner wrote:
> Based on Andrew's suggestion, I changed line 276 to:
>
> args=['psql', '-A', '--pset=pager=off',

That looks like a correct fix for psql, yes. Thanks for pointing that
out Andrew.

Other processes might be confused by (or at least act differently with)
a PAGER env variable, so that still needs to be cleared in general.
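A minimal sketch of the general fix, assuming the child processes are spawned from Python: copy the environment, drop PAGER, and keep the psql-specific flag too. The helper name child_environment is made up for illustration and is not part of dtester.

```python
import os


def child_environment(base=None):
    """Return a copy of the environment with PAGER removed."""
    env = dict(os.environ if base is None else base)
    env.pop("PAGER", None)  # no error if PAGER was not set
    return env


# The psql arguments quoted above; the pset flag stays as a second line of defence.
PSQL_ARGS = ["psql", "-A", "--pset=pager=off"]

# Example with an explicit base environment instead of os.environ.
env = child_environment({"PATH": "/usr/bin", "PAGER": "less"})
```

The resulting dictionary can be handed to whatever spawns the process, for example as the env argument of subprocess.Popen.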

> I now get 5 of 6 tests succeeded (83.3%), processed in 18.5 seconds.

That's perfect. The one test that fails is expected to fail (another
thing dtester doesn't support, yet). The serialization code you write
should finally make that test pass ;-)

> I do want to expand the tests quite a bit -- do I work them all into
> this same file, or how would I proceed? I think I'll need about 20
> more tests, but I don't want to get in the way of your work on the
> framework which runs them.

Well, first of all, another piece of the missing manual: there are
BaseTest and SyncTest classes. Those based on BaseTest run within the
event loop of the twisted framework, thus need to be written in the very
same asynchronous fashion. Mostly calling async methods that return a
Deferred object, on which you may addCallback() or addErrback(). See the
fine twisted documentation, especially the part about "Low-Level
Networking and Event Loop" here:

http://twistedmatrix.com/documents/current/core/howto/index.html

The SyncTest is based on BaseTest, but a new thread is created to run
its run method, passing back its results to the main event loop when
done. That allows you to call blocking methods without having to care
about blocking the entire event loop.

However, it makes interacting between the two models a bit complicated.
To call an async function from a SyncTest, you need to call the syncCall
method. The separate thread then waits for some callback in the main
event loop.
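The thread-plus-event-loop arrangement described above can be sketched with the standard library. This is not dtester or Twisted code: asyncio stands in for the Twisted reactor, and the names async_query, sync_call and run_sync_test are hypothetical. It only illustrates the pattern of a blocking test body that hands an async call to the main loop and waits for the result.

```python
import asyncio


async def async_query(sql):
    # Hypothetical async operation; a real one would talk to a database.
    await asyncio.sleep(0.01)
    return "result of " + sql


def sync_call(loop, coro_func, *args):
    # Runs in the worker thread: schedule the coroutine on the event loop,
    # then block this thread (not the loop) until the result arrives.
    future = asyncio.run_coroutine_threadsafe(coro_func(*args), loop)
    return future.result(timeout=5)


def run_sync_test(loop, results):
    # Body of a SyncTest-style test: ordinary blocking code.
    results.append(sync_call(loop, async_query, "SELECT 1"))


async def main():
    loop = asyncio.get_running_loop()
    results = []
    # The blocking test runs in a separate thread; the loop stays free
    # to service the coroutine that the thread submits.
    await loop.run_in_executor(None, run_sync_test, loop, results)
    return results


results = asyncio.run(main())
```

The timeout in sync_call is a design choice for the sketch: without it, a lost callback would leave the worker thread waiting forever.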

Both have their own set of caveats, IMO.

I'm not sure about how to organize the tests and ongoing development of
the framework. I've already broken the Postgres-R tests with dtester-0.0.

Maybe we put up a git branch with the dtester patches included? So
whenever I want to change the framework, I can check if and how it
affects your tests.

Regards

Markus

--
Sent via pgsql-hackers mailing list (pgsql-hackers(a)postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

From: Jan Urbański on 16 Jan 2010 06:11

Markus Wanner wrote:
>> I do want to expand the tests quite a bit -- do I work them all into
>> this same file, or how would I proceed? I think I'll need about 20
>> more tests, but I don't want to get in the way of your work on the
>> framework which runs them.
>
> Well, first of all, another piece of the missing manual: there are
> BaseTest and SyncTest classes. Those based on BaseTest run within the
> event loop of the twisted framework, thus need to be written in the very
> same asynchronous fashion. Mostly calling async methods that return a
> Deferred object, on which you may addCallback() or addErrback(). See the
> fine twisted documentation, especially the part about "Low-Level
> Networking and Event Loop" here:
>
> http://twistedmatrix.com/documents/current/core/howto/index.html

> I'm not sure about how to organize the tests and ongoing development of
> the framework. I've already broken the Postgres-R tests with dtester-0.0.

Hi,

sorry to butt in to the conversation, but I have spent some time
wrapping/refining the concepts in dtester, and the results are here:

http://git.wulczer.org/?p=twisted-psql.git;a=summary

It requires Twisted and has been tested on Python 2.5 (should work on
2.6, no idea about 3.0). The program you use to run it - trial - should
come with your distro's Twisted packages. The tests don't start a server
or anything, so you need to have a PG instance running. To try it:

git clone git://wulczer.org/twisted-psql.git
cd twisted-psql # this is important, or Python won't find the modules
$EDITOR config.py # set the path to psql and connection details for PG
trial test.test_serialization_error
trial test.test_true_serialization

Both tests should pass, the latter being marked as an expectedFailure.
You can then look at test/*.py to see my (puny) attempt at having some
abstraction layer over the asynchronicity of the tests.
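For readers unfamiliar with the idea of an expected failure, here is a small sketch using the standard library's unittest rather than trial. How twisted-psql marks its test is not shown in this thread, so the class and method below are placeholders only.

```python
import unittest


class SerializationTest(unittest.TestCase):
    @unittest.expectedFailure
    def test_true_serialization(self):
        # Placeholder for a check that the server cannot satisfy yet.
        self.assertTrue(False, "true serializability not implemented")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SerializationTest)
result = unittest.TestResult()
suite.run(result)
```

The run counts as successful because the failure was announced in advance. Once the feature works, the same test is reported as an unexpected success, which is the signal to remove the marker.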

I borrowed the idea of wrapping a psql in a Twisted protocol and added a
Deferred interface around it, which made it possible to run tests with
trial: the Twisted unit testing framework.

As a developer of a fairly large Python system based on Twisted, that
sports hundreds of trial-based tests, I very strongly recommend trial
for asynchronous unit testing. It handles lots of boring details, is
well maintained and Twisted itself is simply designed to do asynchronous
programming. As an added bonus, the running and reporting
infrastructure is already there, you just write the tests.

My code is very rough and lacks good error reporting, for instance
failed tests will probably result in a "test hung" and the need to
Ctrl+C, but that can be easily improved. A thing that would help
tremendously would be a real Twisted protocol that talks to PG on the
protocol level, not by parsing psql output (which is very clumsy and
error prone IMHO).
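To show why parsing psql output is fragile, here is a rough sketch of a parser for unaligned (-A) output, assuming the default "|" field separator, a header row and a trailing "(N rows)" footer. The sample text and the function name parse_unaligned are made up for illustration; this is not the code from twisted-psql or dtester.

```python
import re

FOOTER = re.compile(r"^\(\d+ rows?\)$")


def parse_unaligned(output):
    """Turn psql -A style text into a list of dicts keyed by column name."""
    lines = [line for line in output.splitlines() if line]
    if lines and FOOTER.match(lines[-1]):
        lines.pop()  # drop the "(N rows)" footer
    if not lines:
        return []
    header = lines[0].split("|")
    return [dict(zip(header, line.split("|"))) for line in lines[1:]]


sample = "id|name\n1|alice\n2|bob\n(2 rows)\n"
rows = parse_unaligned(sample)

# A value that itself contains the separator silently corrupts the row.
broken = parse_unaligned("id|name\n1|a|b\n(1 row)\n")
```

The second call is the clumsy part: the value "a|b" is cut at the separator and nothing reports the loss, which is the kind of problem a protocol-level client would avoid.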

I found one such project:
http://www.jamwt.com/pgasync/
but it had some issues with committing (all my test programs were
exiting before PG got the final COMMIT, which made it impossible to do
anything) and it does too many things that Python PG drivers like to do
(like declaring a CURSOR for each query, bleah). A
good implementation would hugely improve the quality and robustness of
any such testsuite.

Cheers,
Jan
--
Jan Urbanski
GPG key ID: E583D7D2

ouden estin


From: "Kevin Grittner" on 16 Jan 2010 08:32

Markus Wanner wrote:
> Kevin Grittner wrote:

> I differentiate tests and test suites. Tests mainly have a run
> method, while test suites have setUp and tearDown ones.

I hadn't caught on to that distinction yet. That should help.

>> "uses" means that the referenced task has complementary setUp and
>> tearDown methods, and the dependent task may only run after a
>> successful invocation of the referenced task's setUp method, and
>> the referenced task will wait for completion of all dependent
>> tasks before invoking tearDown.
>
> Absolutely correct (may I just copy that para for documentation)?
> ;-)

Use any of my language that you like, but I'm not generally known for
my word-smithing ability, so use at your own risk. ;-)

> Two additional things: tests and test suites may have requirements
> (in the form of interfaces). The used test suites are passed to the
> dependent task and it may call the referenced task's methods, for
> example to get the database directory or to run a certain SQL
> command.

Makes sense.

> Second, if the referenced task fails, any running dependent task is
> getting aborted as well. That might be obvious, though.

I figured that, although it's good to have it confirmed.

>> "depends" means that the tearDown method of the referenced task
>> doesn't undo the work of its setUp, at least for purposes of the
>> dependent task. The dependent task can only start after successful
>> completion of the referenced class's work (*just* setUp, or all
>> the way to tearDown?), but the referenced task doesn't need to
>> wait for the dependent task.
>
> Hm.. no, not quite. The fact that not all suites clean up after
> them has nothing to do with how they are referenced ("uses" or
> "depends"). So far, it's entirely up to the test suite. I dislike
> that, but it works. (I've been thinking about some separate
> resource allocation handling and what not, but..)
>
> The only difference between "depends" and "uses" is the
> requirements fulfilling. "uses" does that, while "depends" only
> adds the timing and functional dependencies, but doesn't pass the
> referenced task as an argument to the dependent task.

OK, that accounts for most of the differences between what they
sounded like to me and what I saw in the code. That's workable, now
that I understand it.

>> "onlyAfter" means that the dependent task must wait for completion
>> of the referenced task, but doesn't care whether or not the
>> referenced class completed successfully.
>
> That's how I think it *should* be. ATM "onlyAfter" requires
> successful completion of the referenced task.

That accounts for the rest of the differences.

> I'd like to change that to support "onlyAfter",
> "onlyAfterSuccessOf" and "onlyAfterFailureOf". Plus "onlyBefore"
> for convenience.

onlyAfterSuccessOf would be the same as depends with an empty
tearDown method? So it would effectively be syntactic sugar, for
convenience?

An onlyBefore reference from a to b would be semantically identical
to an onlyAfter reference from b to a? If so, that one seems to me
like it would muddy the waters more than it would help.
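The equivalence asked about here can be checked on paper with a small ordering sketch. It assumes Python 3.9 or later for graphlib, and the function order and its parameters are hypothetical, not dtester's API: an onlyBefore pair is simply rewritten as the mirrored onlyAfter pair before sorting.

```python
from graphlib import TopologicalSorter


def order(tasks, only_after=(), only_before=()):
    """Return one valid run order for the given ordering constraints."""
    sorter = TopologicalSorter({task: set() for task in tasks})
    for a, b in only_after:
        sorter.add(a, b)  # a runs only after b
    for a, b in only_before:
        sorter.add(b, a)  # "a only before b" is "b only after a"
    return list(sorter.static_order())


via_before = order(["a", "b"], only_before=[("a", "b")])
via_after = order(["a", "b"], only_after=[("b", "a")])
```

Both calls build the same graph, so under this model onlyBefore adds no expressive power and would only be a convenience.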

> Thank you for thinking through all of this. I'm sure you understand
> now, why it's not a version 0.1, yet :-)

Thank you for putting it together! I was afraid I was going to have
to go off-task on serializable implementation to write something so I
could test it. I'm more than happy to help stabilize your tool
instead!

-Kevin


From: "Kevin Grittner" on 16 Jan 2010 08:52

Markus Wanner wrote:
> Kevin Grittner wrote:

>> args=['psql', '-A', '--pset=pager=off',

> That looks like a correct fix for psql, yes.

> Other processes might be confused by (or at least act differently
> with) a PAGER env variable, so that still needs to be cleared in
> general.

I see your point. Even with a general solution, probably best to
leave the pset there for psql, though.

> [discussion of BaseTest vs SyncTest]
>
> Both have their own set of caveats, IMO.

I'll look those over. Any caveats beyond what you already mentioned
of which I should be particularly careful?

> Maybe we put up a git branch with the dtester patches included? So
> whenever I want to change the framework, I can check if and how it
> affects your tests.

I strongly encourage you to set that up on git.postgresql.org. If
for some reason that's not practicable, I can give you write
permissions to my repository there and set up a dtester branch for
this. I've barely scratched the surface on git in the last few
weeks, and already I'm a big fan. I was never convinced that
subversion was an improvement over cvs -- they each had advantages
over the other which seemed a wash for me -- but git takes everything
to a whole new level.

-Kevin


From: Markus Wanner on 18 Jan 2010 11:19

Hi,

Quoting "Kevin Grittner" <Kevin.Grittner(a)wicourts.gov>:
> I strongly encourage you to set that up on git.postgresql.org.

I'm about to provide git repositories for Postgres-R anyway, so I've
set up two projects on git.postgres-r.org:

dtester: that's the driver/harness code
postgres-dtests: a Postgres clone with the dtester patch applied - this
is based on the Postgres git repository, so you can easily switch
between Postgres branches.

I'd like to clean postgres-dtests in the sense that all tests included
there are expected to succeed on Postgres HEAD. Those (like the
initial SSI ones) that are expected to fail should get marked as such
in there.

If you want to add SSI-specific tests which you are expecting to
succeed on your branch, I'd recommend you create your own branch
(or merge into your branch from postgres-dtests). Git makes that simple
enough.

> I've barely scratched the surface on git in the last few
> weeks, and already I'm a big fan. I was never convinced that
> subversion was an improvement over cvs -- they each had advantages
> over the other which seemed a wash for me -- but git takes everything
> to a whole new level.

I agree, and would like to extend that to DVCSes in general. Having
started with monotone, I'm used to a different level of convenience,
especially WRT network exchange and merging. To be fair, I'm excited
about how fast git is (especially WRT network exchange, where monotone
just plain sucks).

> I see your point. Even with a general solution, probably best to
> leave the pset there for psql, though.

Agreed, that's fixed in the new git repository.

> I'll look those over. Any caveats beyond what you already mentioned
> of which I should be particularly careful?

Uh.. no, there's no difference other than that. It's a paradigm
question. Some like it one way, others the other. And then there are
applications that fit one model better than the other...

Now, I tend towards the event based approach, because it basically
relieves you from having to deal with concurrent threads and all their
issues. You need to get a single ordering of events anyway, if you want
to check ordering constraints.

Regards

Markus

