
News

Posted about 10 years ago
I've read from many sources that PL/V8 mathematical operations are generally faster than what you get with SQL functions and PL/pgSQL functions. One area where I thought this speed would be really useful was for writing Map Algebra callback functions. A PostGIS 2.1+ map algebra callback function signature looks like:

func_name(double precision[][][] value, integer[][] pos, text[] VARIADIC userargs)

Continue reading "Writing PostGIS raster Map Algebra Callback Functions in PL/V8"
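For a flavor of what such a callback can look like, here is a minimal, untested sketch in PL/V8 (the function name and logic are invented for illustration; it returns the center pixel of band 1, doubled, and passes NULL through):

CREATE OR REPLACE FUNCTION plv8_double_value(
    value double precision[][][],
    pos integer[][],
    VARIADIC userargs text[]
) RETURNS double precision AS $$
    // PL/V8 exposes the arrays as nested, 0-indexed JavaScript arrays:
    // band 1, center of the neighborhood.
    var pixel = value[0][0][0];
    if (pixel === null) return null;
    return pixel * 2;
$$ LANGUAGE plv8 IMMUTABLE;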
Posted about 10 years ago
I’ll be giving a tutorial at PyCon 2014 in Montréal. Register early and often!
Posted about 10 years ago
Earlier today, I wrote about a fix which broke search. At the end of that post, I mentioned a few things which needed to be done to fix up the broken relationships:

- clean up the database
- add a foreign key, on delete set null
- trigger to find the right value when set null

Let’s get [...]
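A minimal sketch of the second item, with hypothetical table and column names (the actual schema isn't shown in this excerpt):

ALTER TABLE posts
    ADD CONSTRAINT posts_author_fk
    FOREIGN KEY (author_id) REFERENCES authors (id)
    ON DELETE SET NULL;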
Posted over 10 years ago
There's a lot of back and forth on NoSQL databases. The unfortunate part of all the back and forth, and the unclear definitions of NoSQL, is that many of the valuable lessons are lost. This post isn't about the differences in NoSQL definitions, but rather about some of the huge benefits that do exist in what's often grouped into the schema-less world and that could easily be applied to the relational world.

Forget migrations

Perhaps the best thing about the idea of a schemaless database is that you can just push code and it works. Almost exactly five years ago Heroku shipped git push heroku master, letting you simply push code from git and have it just work. CouchDB and MongoDB have done something similar for databases: you don't have to run CREATE TABLE or ALTER TABLE migrations before working with your database. There's something wonderful about just building and shipping your application without worrying about migrations.

This is often viewed as a limitation of relational databases, yet it doesn't really have to be. Even in a schema-less database the relationships are still there; it's just that you're managing them at the application level. There's no reason higher-level frameworks or ORMs couldn't handle the migration process. As it is today, the process of adding a column to a relational database is quite straightforward, in the sense that it doesn't introduce downtime and still lets the developer move quickly; it's just not automatically baked in:

-- Assuming a column that's referenced doesn't exist, your ORM could
-- automatically execute the relevant bits. This isn't code meant for
-- you to run as-is.
ALTER TABLE foo ADD COLUMN bar varchar(255);  -- this is near instant

-- Set your default value in your ORM
UPDATE foo SET bar = 'DEFAULT VALUE' WHERE bar IS NULL;
ALTER TABLE foo ALTER COLUMN bar SET NOT NULL;

If Rails/Django/(framework of your choice) automatically noticed the need for a column to exist and made the appropriate modifications, you could work with it the same way you would manage a document relation in your code. Sure, this is a manual, painful process today, but there's no reason it can't be fully handled by PostgreSQL or directly within an ORM.

Documents

The other really strong case for the MongoDB/CouchDB camp is document storage. Here I'm going to equate a document directly with a JSON object. JSON itself is a wonderfully simple model that works well for portability, and having to convert to it within your application layer is, well, painful. Yes, Postgres has a JSON datatype, and the JSON datatype is now being adopted by many other relational databases; I was shocked to hear that DB2 is getting support for JSON myself, and while I expect improvements to keep coming to it, JSON was not at the top of my list. JSON absolutely makes sense as a data type within a column, but that's still a bit limiting as a full document store; what you want in those cases is any query result as a full JSON object. It is heavily undersold that within Postgres you can simply convert a full row to JSON with a single function: row_to_json. Again, having higher-level frameworks take full advantage of this, so that under the covers you can have your strongly typed tables along with the flexibility to map them to JSON objects, makes a great deal of sense here.
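That single-function conversion looks like this (table and row are invented for the example):

SELECT row_to_json(u) FROM users u WHERE id = 1;
-- returns something like: {"id":1,"email":"user@example.com","created_at":"2014-01-01 00:00:00"}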
Out of the box interfaces

This isn't a strict benefit of schema-less databases; some of them, such as Couch, have this out of the box, while others less so. The concept of exposing a REST interface is not new, and it has been tried on top of relational databases a few times. It is clearly something that needs to be delivered: the case for it is straightforward, since it saves people from recreating admin screens and gives newcomers an easy onboarding path. Unfortunately there's no clear progress on this today for Postgres or other relational databases, while other databases are delivering on this front, often from day one.

Where to

Some of the shifts in schema-less databases, or really in other databases in general, are not so large that they cannot be subsumed into a broader option. At the same time, there are some strong merits, such as the ones above, which do take an active effort to deliver on in expanding what a "relational database" is.
Posted over 10 years ago
One of the legitimate points Martin Fowler has argued in favor of NoSQL databases is that expecting applications to directly manipulate relational data is far less clean, from an application design perspective, than having a database encapsulated behind a loosely coupled interface (like a web service). I would actually go further and point out that such an approach invariably leads to bad database design too, because the information layout becomes the contracted software API, and thus one either has to spend a lot of time and effort separating logical from physical storage layouts, or one ends up with an ossified physical layout that can never change.

This problem has been well understood in the relational database community for a long time. The real problem has been tooling. There are effectively two traditional tools for addressing this issue:

1. Updateable views. These form a relational API that allows the database to store information in a way separate from how the application sees it. If you are using an ORM, this is a really valuable tool.

2. Stored procedures. These provide a procedural API, but traditionally a relatively brittle one, based on the same approach used by libraries: you typically have an ordered series of arguments, and all users of the API are expected to agree on the ordering and number of arguments. While this may work passably for a single system (and even there lead to "dependency hell"), it poses significant issues in a large heterogeneous environment, because the number of applications which must be coordinated in terms of updates becomes very high. Oracle solves this using revision-based editions, so you can have side-by-side versioning of stored procedures, and applications can specify which edition they are working with. This is similar to the side-by-side versioning of C libraries typical on Linux, or of assemblies in .NET.

On the application side, ORMs have become popular, but they still lead to a relational API being contractual, so they are really best used with updateable views.
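As a minimal sketch of the first tool (schema invented for illustration; simple views like this are auto-updatable in PostgreSQL 9.3+, while older versions need rules or INSTEAD OF triggers):

CREATE TABLE customer_data (
    id        serial PRIMARY KEY,
    full_name text NOT NULL
);

-- The view is the API; the physical table behind it can be reorganized later.
CREATE VIEW customers AS
    SELECT id, full_name AS name FROM customer_data;

-- Applications write through the view, not the table.
INSERT INTO customers (name) VALUES ('Acme Corp');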
In part because of these shortcomings, we started writing ways around them for LedgerSMB starting with 1.3. The implementations are PostgreSQL-specific. More recently I wrote some Perl modules, now on CPAN, to implement these concepts. These create the general PGObject framework, which gives an application access to PostgreSQL stored procedures in a loosely coupled way. It is hoped that other implementations of the same ideas will be written and that other applications will use this framework.

The basic premise is that a procedural interface that is discoverable allows for easier management of software contracts than one which is non-discoverable. The discoverability criteria then become the software contract.

PGObject allows what I call "API paradigms" to be built around stored procedures. An API paradigm is a consistent specification of how to write discoverable stored procedures and then re-use them in the application. Most namespaces under PGObject represent such "paradigms." The exceptions currently are the Type, Util, Test, and Debug second-tier namespaces. Currently PGObject::Simple is the only available paradigm. What follows is a general writeup of the currently usable PGObject::Simple approach and what each module does.

PGObject

PGObject is the bottom-half module. It is designed to service multiple top-half paradigms (the Simple paradigm is described below, and I am also working on a CompositeType paradigm which probably won't be ready initially). PGObject has effectively one responsibility: coordinate between application components and the database. This is split into two sub-responsibilities:

1. Locate and run stored procedures.
2. Encode/decode data for running in #1 above.

Specifically outside the responsibility of PGObject is anything to do with managing database connections, so every call to a database-facing routine (locating or running a stored procedure) requires a database handle to be passed in. The reason for this is that database handles should be managed by the application, not our CPAN modules, and this needs to be flexible enough to handle the possibility that more than one database connection may be needed by an application. This is not a problem, because developers will probably not call these functions unless they are writing their own top-half paradigms (in which case the number of places in their code where they issue calls to these functions will be very limited). A hook is available to retrieve only functions with a specified first argument type; if more than one function is found that matches, an exception is thrown. The Simple top-half paradigm (below) has a total of two such calls, and that's probably typical.

The encoding/decoding system is handled by a few simple rules. On delivery to the database, any parameter that can('to_db') runs that method, and the return value is inserted in place of the parameter in the stored procedure call. This allows one to have objects which specify how they serialize: BigFloats can serialize as numbers, DateTime subclasses can serialize as date or timestamp strings, and more complex types can serialize however is deemed appropriate (to JSON, a native type string form, a composite type string form, etc.). On retrieval from the database, the type of each column is checked against a type registry (sub-registries may be used for multiple-application support, and can be specified at call time as well). If the type is registered, the return value is passed to the $class->from_db method and the output returned in place of the original value. This allows any database type to be mapped back to a handler class. Currently PGObject::Type is a reserved namespace for released type-handler classes. We have a type handler for DateTime and one for BigFloat written already, and are working on one for JSON database types.

PGObject::Simple

The second-level modules outside of a few reserved namespaces designate top-half paradigms for interacting with stored procedures. Currently only Simple is supported. It must be subclassed to be used by an application, with a method provided to retrieve or generate the appropriate database connection. This allows application-specific wrappers which can interface with other db connection management logic. All options for PGObject->call_procedure are supported, including running aggregates, order by, etc. This means more options are available for things like gl reports database-side than the current LedgerSMB code allows. $object->call_dbmethod uses the args argument, a hashref mapping argument names to values. If I want to have a ->save_as_new method, I can add args => {id => undef} to ensure that undef will be used in place of $self->{id}. Both call_procedure (for enumerated arguments) and call_dbmethod (for named arguments) are supported both from the package and from the object.
So you can MyClass->call_dbmethod(...) and $myobj->call_dbmethod(...). Naturally, if the procedure takes args, you will need to specify them or it will just submit nulls.

PGObject::Simple::Role

This is a Moo/Moose role handler for PGObject::Simple. One of its main features is the ability to declaratively define db methods. So instead of:

sub int {
    my ($self) = @_;
    return $self->call_dbmethod(funcname => 'foo_to_int');
}

you can just:

dbmethod(int => (funcname => 'foo_to_int'));

We will probably move dbmethod off into another package so that it can be imported early and used elsewhere as well. This would allow it to be called without the outermost parentheses.

The overall benefit of this framework is that it allows for discoverable interfaces and the ability to specify on the database what an application needs to know. This allows for many of the benefits of both relational and NoSQL databases at the same time, including development flexibility, discoverable interfaces, encapsulation, and more.
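To make "discoverable" concrete: in PostgreSQL the system catalogs themselves can serve as the contract, since a caller can look up a function's argument list before calling it. A sketch (the function name is illustrative):

SELECT p.proname,
       pg_get_function_arguments(p.oid) AS argument_list
FROM pg_proc p
JOIN pg_namespace n ON n.oid = p.pronamespace
WHERE n.nspname = 'public'
  AND p.proname = 'customer_save';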
Posted over 10 years ago
On 18th of January, Stephen Frost committed patch: Add ALTER TABLESPACE ... MOVE command

This adds a 'MOVE' sub-command to ALTER TABLESPACE which allows moving sets of objects from one tablespace to another. This can be extremely handy and avoids a lot of error-prone scripting. ALTER TABLESPACE ... MOVE will only move objects the […]
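Based on the commit description, usage looks something like this (tablespace names invented for the example):

-- Move every object the current user owns in ssd_space over to hdd_space:
ALTER TABLESPACE ssd_space MOVE ALL TO hdd_space;
-- Or restrict the move to one class of objects:
ALTER TABLESPACE ssd_space MOVE TABLES TO hdd_space;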
Posted over 10 years ago
Nowadays, failover is one of the simplest procedures in Postgres. Still, there are not many ways to automate this procedure and perform it without any human interference. This blog post will guide you through a way to implement automated failover using two open source projects, repmgr2 and pgbouncer.

PgBouncer will act as a connection pool and as an entry point to the database cluster. Repmgr now (since version 2) supports automatic failover: the slave will try to connect to the master for a predefined time and number of attempts, and if it fails, it will run a script that can be anything. We are going to use this functionality to perform auto-failover, redirect our application connections to the new master database, and notify the DBA that a failover has been performed. Each server will have its own copy of the pgbouncer configuration file(s), and when a server is promoted, the new master also has the responsibility to overwrite the configuration file on the server that pgbouncer runs on (failover.sh).

In my example I used:

- 3 nodes (3 Debian net-install virtual machines)
- $PGDATA configured on each node
- 3 Postgres installations, one on each node
- repmgr2-beta2 or later
- pgbouncer-1.5.4
- rsync
- sshd

Hostnames:

bouncer 192.168.0.16 (witness and pgbouncer server; will act as the entry point to our database cluster)
pgnode1 192.168.0.202 (postgres database server 1)
pgnode2 192.168.0.203 (postgres database server 2)

I will use these hosts as examples and as a point of reference for all the configuration files. Let's start...

Install Postgres on all nodes, and set up passwordless ssh between the three nodes for a user that has access to write in $PGDATA and run the Postgres binaries. Install repmgr on all nodes, and pgbouncer only on bouncer.

On pgnode1 set the following in $PGDATA/postgresql.conf:

listen_addresses = '*'
wal_level = 'hot_standby'
archive_mode = on
archive_command = '/bin/false'
max_wal_senders = 10
wal_keep_segments = 5         # for my example; the value suggested by repmgr is below
#### wal_keep_segments = 5000 # 80 GB required on pg_xlog
hot_standby = on
shared_preload_libraries = 'repmgr_funcs'

and in $PGDATA/pg_hba.conf:

host    all             repmgr    192.168.0.0/16    trust
host    replication     repmgr    192.168.0.0/16    trust

Bring PostgreSQL up on pgnode1 and run:

createuser -s repmgr
createdb -O repmgr repmgr
psql -f </usr/share/postgresql/9.0>/contrib/repmgr_funcs.sql repmgr

On pgnode2 run:

repmgr -d repmgr -U repmgr -h pgnode1 standby clone

(this will use rsync). Start Postgres on pgnode2; you should have a new slave.

Edit a repmgr.conf wherever you like (in my case /etc/repmgr/repmgr.conf) and add:

cluster=my_cluster
node=1
#priority=-1
node_name=pgnode1
conninfo='host=192.168.0.202 dbname=repmgr user=repmgr'
master_response_timeout=50
reconnect_attempts=5
reconnect_interval=1
failover=automatic
promote_command='/home/vasilis/configs/failover.sh'
follow_command='repmgr standby follow -f /etc/repmgr/repmgr.conf'

Each node will need its own repmgr.conf file; check the github link for all my config files.

On pgnode1:

repmgr -f /etc/repmgr/repmgr.conf master register

On pgnode2:

repmgr -f /etc/repmgr/repmgr.conf standby register

On bouncer (witness server):

repmgr -d repmgr -U repmgr -h 192.168.0.16 -D PGDATA -f /etc/repmgr/repmgr.conf witness create

On pgnode2 and on bouncer run:

repmgrd -f /etc/repmgr/repmgr.conf > /var/log/postgresql/repmgr.log 2>&1

(this is the daemon for repmgr)

Put the configuration file of pgbouncer from the master (pgnode1) onto bouncer (the pgbouncer server).
Restart and test connectivity. If you stop pgnode1's Postgres, pgnode2 will take over, rsync the pgbouncer configuration file to bouncer, restart pgbouncer over ssh, and mail a notification.

Conclusion

The main idea behind this implementation is that pgbouncer is the only entry point from the application to the database; each node, upon promotion, is responsible for "showing" pgbouncer where it should now connect. This way the application can remain unaware of the current master db. An alternative implementation could skip pgbouncer completely, and each slave could directly change the configuration file of the application. In my example I used pgbouncer for three reasons: to ensure stability, to enforce connection pooling (and its advantages), and to avoid the assumption that the database stack has access to the application stack, which in my opinion it shouldn't.

Notes:

- If the master is running repmgrd and the db crashes, the daemon will exit because it can't connect to the db; it needs a way to be restarted, possibly with Postgres startup.
- Bouncer in our example is a single point of failure; the witness db can be configured like pgnode1/2 with repmgr, with a second pgbouncer on a second machine. The application should take care of how to use these two pgbouncers.
- failover.sh runs pg_ctl promote, which is a PostgreSQL 9.1 feature; for versions prior to 9.1 you may "touch" a trigger file.

Github link of all the configuration files used

Thanks for reading
--
Vasilis Ventirozos
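After a promotion, one quick way to confirm the roles (not part of the original scripts, just a handy check) is to ask each node whether it is still in recovery:

SELECT pg_is_in_recovery();  -- returns false on the master, true on a standby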
Posted over 10 years ago
When I say version-specific here, I'm speaking of the PostgreSQL version (9.1, 9.2, etc.), not the extension version. An upcoming update to PG Partition Manager takes advantage of the range data type, which is a 9.2 feature. But I also wanted to keep backward compatibility with 9.1 by excluding this new feature. The more important thing is to get the extension installed in the first place, so the Makefile was where I started.

EXTENSION = pg_partman
EXTVERSION = $(shell grep default_version $(EXTENSION).control | \
   sed -e "s/default_version[[:space:]]*=[[:space:]]*'\([^']*\)'/\1/")

DATA = $(filter-out $(wildcard updates/*--*.sql),$(wildcard sql/*.sql))
DOCS = $(wildcard doc/*.md)
SCRIPTS = bin/dump_partition.py bin/partition_data.py bin/reapply_indexes.py bin/undo_partition.py
PG_CONFIG = pg_config
PG91 = $(shell $(PG_CONFIG) --version | egrep " 8\.| 9\.0" > /dev/null && echo no || echo yes)
PG92 = $(shell $(PG_CONFIG) --version | egrep " 8\.| 9\.0| 9\.1" > /dev/null && echo no || echo yes)

ifeq ($(PG91),yes)
all: sql/$(EXTENSION)--$(EXTVERSION).sql

ifeq ($(PG92),yes)
sql/$(EXTENSION)--$(EXTVERSION).sql: sql/types/*.sql sql/constraints/*.sql sql/tables/*.sql sql/functions/*.sql sql/92/tables/*.sql
	cat $^ > $@
else
sql/$(EXTENSION)--$(EXTVERSION).sql: sql/types/*.sql sql/constraints/*.sql sql/tables/*.sql sql/functions/*.sql
	cat $^ > $@
endif

DATA = $(wildcard updates/*--*.sql) sql/$(EXTENSION)--$(EXTVERSION).sql
EXTRA_CLEAN = sql/$(EXTENSION)--$(EXTVERSION).sql
endif

PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)

Keep in mind, I know very little about Makefiles in the first place, so my solution is using just what I already knew how to do in the original Makefile. There may be a better way, and if there is, I'd love feedback. I just expanded on the already existing lines that check for a minimal version of 9.1 to even allow the extension to install. And I organized my code folders to have one specifically for 9.2+ features (sql/92/tables/). After that it's just another check to see if the PG92 variable is true, and if so, grab the code from that additional folder. Now the next challenge is checking the version when installing an update to an already existing installation. First, we need a means of checking if the current version is new enough. I made up the below function for inclusion in my extension, since I'll actually be doing version checking elsewhere as well.
/*
 * Check PostgreSQL version number. Parameter must be a full 3-point version.
 * Returns true if the current version is greater than or equal to the parameter given.
 */
CREATE FUNCTION check_version(p_check_version text) RETURNS boolean
    LANGUAGE plpgsql STABLE
    AS $$
DECLARE
    v_check_version     text[];
    v_current_version   text[] := string_to_array(current_setting('server_version'), '.');
BEGIN
    v_check_version := string_to_array(p_check_version, '.');

    IF v_current_version[1]::int > v_check_version[1]::int THEN
        RETURN true;
    END IF;
    IF v_current_version[1]::int = v_check_version[1]::int THEN
        IF v_current_version[2]::int > v_check_version[2]::int THEN
            RETURN true;
        END IF;
        IF v_current_version[2]::int = v_check_version[2]::int THEN
            IF v_current_version[3]::int >= v_check_version[3]::int THEN
                RETURN true;
            END IF; -- 0.0.x
        END IF; -- 0.x.0
    END IF; -- x.0.0

    RETURN false;
END
$$;

Now we can just take advantage of the DO block syntax to allow script code in a plain SQL file.

/********* 9.2+ stuff ********/
DO $$
BEGIN
IF @extschema@.check_version('9.2.0') THEN
    CREATE TABLE custom_time_partitions (
        parent_table text NOT NULL
        , child_table text NOT NULL
        , partition_range tstzrange NOT NULL
        , PRIMARY KEY (parent_table, child_table));
    CREATE INDEX custom_time_partitions_partition_range_idx ON custom_time_partitions USING gist (partition_range);
END IF;
END
$$;
/********* end 9.2+ stuff ********/

The full implementation of this will be available in the 1.6.0 update for pg_partman that I'll be releasing in the near future. It contains a feature I've wanted to get done for a while now: custom time intervals!
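For illustration, calling the check_version function above (assuming it is installed in your search_path) on a hypothetical 9.2.4 server:

SELECT check_version('9.1.0');  -- true,  since 9.2.4 >= 9.1.0
SELECT check_version('9.3.0');  -- false, since 9.2.4 <  9.3.0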
Posted over 10 years ago
Npgsql has new developer and user communication channels. Shay Rojansky created two new discussion groups for Npgsql: npgsql-dev and npgsql-help. Those groups will be used in place of the Npgsql-devel mailing list and the forums available on pgfoundry.org, respectively. Please feel free to send your questions and comments to those groups. We hope they prove very helpful to users looking for Npgsql support.
Posted over 10 years ago
With the introduction of the pg_prewarm extension in Postgres, it has become very easy to save and restore the contents of the Postgres server's buffer cache across a server restart. Following is a brain-dead way of implementing hibernation, and I am sure it can be optimized to reduce the number of calls to pg_prewarm, since it allows the caller to specify a range of blocks to be loaded. We're going to use pg_buffercache to extract the list of buffers currently loaded in the buffer cache, and after a server restart pg_prewarm will be used to load those buffers back in.

Declare environment variables to be used by the scripts

HIBERNATE_DESTINATION=$HOME/pg_hibernate/
export PGUSER=postgres
PSQL="psql"
PSQL_TEMPL_DB="$PSQL -d template1"
PSQL_PG_DB="$PSQL -d postgres"
PSQL_TEMPL_NOFLUFF="$PSQL_TEMPL_DB -A -t"
PSQL_PG_NOFLUFF="$PSQL_PG_DB -A -t"

Prepare the databases

This is a one-time operation. We avoid installing anything into the template0 database, since it is a read-only database. But we do consciously install the extension into the template1 database; this is so that any new databases created after this point will get this extension pre-installed.

Install extensions

for db in $($PSQL_TEMPL_NOFLUFF -c 'select datname from pg_database where datname <> $$template0$$'); do
  echo Installing pg_prewarm in $db
  $PSQL -d $db -c 'create extension if not exists pg_prewarm;'
done

echo Installing pg_buffercache extension in postgres database
$PSQL_PG_DB -c 'create extension if not exists pg_buffercache;'

Save buffer information

We are actually generating a psql script that can later be fed to psql as-is.

mkdir -p $HIBERNATE_DESTINATION
for db in $($PSQL_TEMPL_NOFLUFF -c 'select datname from pg_database where datname <> $$template0$$'); do
  $PSQL_PG_NOFLUFF -c 'select $q$select pg_prewarm((select oid from pg_class where pg_relation_filenode(oid) = $q$
      || relfilenode
      || $q$)::regclass, $$buffer$$, $q$
      || case relforknumber
           when 0 then $q$$$main$$$q$
           when 1 then $q$$$fsm$$$q$
           when 2 then $q$$$vm$$$q$
           when 3 then $q$$$init$$$q$
         end
      || $q$, $q$ || relblocknumber || $q$, $q$ || relblocknumber || $q$);$q$
      from pg_buffercache
      where reldatabase = (select oid from pg_database where datname = $$'${db}'$$)
      order by relfilenode, relforknumber, relblocknumber;' > $HIBERNATE_DESTINATION/${db}.save
done

Restore the buffers

We connect to each database and simply feed the script generated earlier into psql.

for db in $($PSQL_TEMPL_NOFLUFF -c 'select datname from pg_database where datname <> $$template0$$'); do
  if [ ! -e $HIBERNATE_DESTINATION/${db}.save ]; then
    continue
  fi
  $PSQL -d $db -f $HIBERNATE_DESTINATION/${db}.save
done

At this point, the Postgres shared buffers should contain all the buffers that were present when we extracted the buffer list from pg_buffercache.

PS: When reviewing the pg_prewarm code, I did not think through the user-experience aspect of this extension. But after it was committed, the more I thought about how it's going to be used, the less appealing this solution became. As is evident from the above, the administrator needs the help of two extensions, plus a storage location for the list of buffers. Ideally, as a DBA, I would like to see a feature which doesn't require me to muck around with the catalogs etc. I have a design for such an extension, and may start coding it some time soon.
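For reference, the generated script is just a long series of pg_prewarm calls with explicit block ranges; at its simplest the function can also warm a whole relation (table name invented for the example):

SELECT pg_prewarm('my_table');                           -- whole relation, default 'buffer' mode
SELECT pg_prewarm('my_table', 'buffer', 'main', 0, 99);  -- only blocks 0-99 of the main fork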