Post by Federico Di Gregorio
Post by Jan Urbański
a couple of months ago - it's missing the query cancellation feature
that I will try to implement after reading the libpq documentation some
more.
I'll keep on working on exposing more async features of the C library
and adding more unit tests (I've started another thread with the fix for
the test suite for PG 9.0).
I'll start by this email. I don't agree on exposing more and more
features of libpq. While async is surely backend specific (especially if
done through libpq) I'd like to keep the API generic enough to be
implemented by other database adapters.
I agree 100%. Let's start by designing an API then, although I don't
know anything about other databases, so I won't be able to tell if
similar things would be possible for MySQL or Oracle etc.
Post by Federico Di Gregorio
In theory you should be able to start a query and then manage the cursor
as any file descriptor, using select() or poll() with as little
psycopg-specific code as possible. I think the best would be to have an attribute on
the connection object to specify the _default_ for cursor.execute() [I
don't like the idea of completely async connections].
I'm not entirely sure that's the case. Just grabbing the socket file
descriptor and putting it in a select() without using the
library-specific APIs might break that library's expectations of how
things should be done.
So you're saying you'd prefer a way to tell the connection which kind of
cursors it should be creating - that's OK for me. Would you like that
property to be changeable during the lifetime of a connection? There is
no technical reason to forbid that, but I'd be equally happy with just
having to choose sync vs async at connection creation time.
Post by Federico Di Gregorio
Also, having a way to tell from the connection if an async query is
running seems a good idea, so, probably .isready() should exist both at
the cursor and connection level.
Yeah, asking the connection if it's doing an async query at a given
moment would be useful.
Post by Federico Di Gregorio
If reactors and coroutines require additional methods then we should
provide *one* API that exposes enough functionality for all of them (but
I don't use such stuff so someone else will be required to do this work).
I think that deep down under, the frameworks need a way to get a file
descriptor that can be put in a select(), a way to figure out if they
should be waiting for the fd to become readable or writable and a
method to call after the select() call returns. At least I'm sure that
would be enough for Twisted to be able to wrap such a driver.
I have since implemented a way to build connections asynchronously and
the API looks like this:
conn = psycopg2.connect(tests.dsn, async=True)
state = psycopg2.extensions.POLL_WRITE
while state != psycopg2.extensions.POLL_OK:
    if state == psycopg2.extensions.POLL_WRITE:
        select.select([], [conn.fileno()], [])
    elif state == psycopg2.extensions.POLL_READ:
        select.select([conn.fileno()], [], [])
    state = conn.poll()
so you create an async connection and then loop in a select() until the
driver tells you that your connection has been built. This required
adding two additional methods on the connection - fileno() and poll().
I'm not sure if poll() is the best name; isready() would be another
choice, but it can't simply return a boolean: libpq support for async
connection building requires you to switch between waiting on write and
read on the connection socket.
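To illustrate how an event loop (rather than a bare select() loop) could drive such a connection, here is a rough sketch using the standard selectors module. It is not psycopg2 code: FakeConnection, drive() and the numeric POLL_* values are stand-ins made up for the example, and the only thing assumed about the driver is that it offers fileno() and poll() as described above.

```python
import selectors
import socket

# Hypothetical stand-ins for psycopg2.extensions.POLL_OK / POLL_READ / POLL_WRITE.
POLL_OK, POLL_READ, POLL_WRITE = 0, 1, 2


class FakeConnection:
    """Made-up stand-in for an async connection: only fileno() and poll()."""

    def __init__(self, sock, states):
        self._sock = sock
        self._states = list(states)  # scripted results for successive poll() calls

    def fileno(self):
        return self._sock.fileno()

    def poll(self):
        return self._states.pop(0)


def drive(conn, first_state=POLL_WRITE):
    """Wait on conn's fd in the direction poll() asks for, until POLL_OK.

    Returns the list of states poll() reported, in order.
    """
    sel = selectors.DefaultSelector()
    state = first_state
    seen = []
    while state != POLL_OK:
        if state == POLL_WRITE:
            events = selectors.EVENT_WRITE
        else:
            events = selectors.EVENT_READ
        sel.register(conn.fileno(), events)
        sel.select()  # a real reactor would be servicing other fds here too
        sel.unregister(conn.fileno())
        state = conn.poll()
        seen.append(state)
    sel.close()
    return seen
```

With a socket pair standing in for the database socket, drive(FakeConnection(sock, [POLL_READ, POLL_OK])) waits once for writability, once for readability, and then returns. A framework like Twisted would do the register/unregister step from its own reactor instead of owning the loop.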
So how about we try to design an async API that would make sense and be
useful for both Twisted and Eventlet? Just to get the ball rolling I'll
try to start something. I'd love to hear from the Eventlet people and know if
that would be enough for them.
psycopg2.connect() gets a new kwarg, async=False. If it's True, the
connect() call returns immediately and the returned Connection object is
in a state where calling any method except for poll() or close() raises
an exception.
Connection.poll() (or .isready()) returns one of three values: POLL_OK,
POLL_WRITE or POLL_READ. POLL_OK means the connection has been built; the
other two tell you whether to put the connection socket in the writefds or
the readfds set of select(). It can also raise an exception, which would be
analogous to getting an exception from a normal blocking call to
psycopg2.connect().
Connection.fileno() returns the fd associated with the connection socket.
Connection.issync() returns a boolean saying if the connection is sync
or not.
Connection.executing() returns a boolean saying if there's an async
query being executed at the moment, i.e. if a call to execute() on an
async cursor for this connection will raise an exception (or block) or
not. FIXME: get a better name for that. Maybe use isready() and use
poll() only for connection building?
Connection.cursor() loses the async kwarg. Instead all cursors created
by an async connection are async. Trying to create a named cursor with
an async connection raises an exception (if people want real cursors
they will have to issue the DECLARE CURSOR themselves - sucks, but
that's a first approach).
Connection.commit() and rollback() block, throw away any asynchronous
result and commit/roll back the transaction. If I were using that API
I would set it to autocommit mode anyway and do my own BEGINs and
COMMITs, I think.
Connection.set_isolation_level() blocks.
Connection.set_client_encoding() blocks.
Connection.lobject() and large objects in general work the same way as
with a sync connection.
Cursor.execute() on an async cursor returns immediately. If there's
already an async query underway it either throws an exception or blocks
until the previous query has completed, then throws away its result and
runs another execute(). I actually kind of like the latter approach,
because then you can use the async cursors in almost the same way as the
sync ones.
Cursor.isready(), if there's no async query in progress, tries to flush
the connection's write buffer and returns a boolean saying whether the
whole query has been sent already. If there's an async query in progress
it tries to get some data and returns a boolean saying whether it
managed to get all of it. You will still need two select() loops to
safely issue a query, but that's unavoidable: you have to send it and
read the result. But at least you can have just one method to do that.
Cursor.callproc() works just like execute().
Cursor.executemany() on an async cursor raises an exception.
Cursor.fetch*() blocks until it gets the result and gives it to the client.
Cursor.scroll() also blocks and then does the scroll.
Cursor.copy_from/copy_to/copy_expert always block, at least for now.
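To make the "two select() loops" around Cursor.isready() concrete, here is a rough sketch of what client code might look like under this proposal. Nothing in it exists in psycopg2: FakeCursor is a made-up stand-in whose isready() just replays scripted answers, and wait_until_ready() and run_query() are hypothetical helper names. The fd would come from Connection.fileno().

```python
import select
import socket


class FakeCursor:
    """Made-up stand-in for an async cursor; isready() replays scripted answers."""

    def __init__(self, answers):
        self._answers = list(answers)

    def isready(self):
        return self._answers.pop(0)


def wait_until_ready(cur, fd, writing):
    """Loop in select() until cur.isready() is true; return how many waits it took."""
    waits = 0
    while not cur.isready():
        if writing:
            select.select([], [fd], [])  # wait until the socket is writable
        else:
            select.select([fd], [], [])  # wait until the socket is readable
        waits += 1
    return waits


def run_query(cur, fd):
    """Two phases after a (not shown) cur.execute(): send the query, read the result."""
    sent_waits = wait_until_ready(cur, fd, writing=True)   # phase 1: flush the query
    recv_waits = wait_until_ready(cur, fd, writing=False)  # phase 2: collect the result
    return sent_waits, recv_waits
```

The same method is used in both phases, which is the point of folding issent() into isready(): the caller only has to remember which direction to wait on.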
I'll publish my psycopg2 branch after I clean it up a bit, but I would
love to agree on an API before implementing the whole thing. Right now I
have async connection building and an issent() method on the cursor, and
I can now see that it should be simply folded into isready().
Cheers,
Jan