
News

Posted over 10 years ago
Nordic PGDay 2014 will be held in Stockholm, Sweden, at the Hilton Stockholm Hotel, on March 20, 2014. It will feature a full day with a single track of PostgreSQL presentations from both local and global speakers, covering a wide range of topics. For more information about the conference, please see our website.

We are currently accepting proposals for presentations at the conference. We are interested in submissions from both seasoned PostgreSQL experts and people new to the community, from both locals and representatives of the global community -- in short, from anybody who has an interesting story to tell about PostgreSQL, whether deeply technical or a story about a successful (or failed) deployment. All presentations are 45 minutes, with time for questions. Please see the website for details.

We have also opened Early Bird registration. Early Bird registration costs EUR 40 and is available until Feb 16th, after which the price goes up to EUR 60. Space at the venue is limited, so we suggest you register early!

Finally, we are also looking for sponsors! We have two different levels of sponsorship, to make sure there is a choice for everybody. If you are interested, please see the sponsor page.

We look forward to seeing you in Stockholm in March!
Posted over 10 years ago
Postgres’ parameters have an associated context, which determines when that parameter can be changed. You can see the context of every parameter using the following query; a sample of its output is also shown.

select name, context from pg_settings order by category;

              name               |  context
---------------------------------+------------
 autovacuum_freeze_max_age       | postmaster
 autovacuum_max_workers          | postmaster
 autovacuum_vacuum_threshold     | sighup
 ...
 IntervalStyle                   | user
 ...
 server_encoding                 | internal
 ...
 lc_messages                     | superuser
 ...
 local_preload_libraries         | backend
 ...

The possible values of context are:

internal
postmaster
sighup
backend
superuser
user

The above list is in order of when a parameter can be set; if a parameter can be changed in a certain context, then it can also be changed at any of the earlier contexts.

internal: Internal parameters cannot be changed; these are usually compile-time constants. If you want to change one of these, you’ll have to change it in the Postgres source code and compile a new set of Postgres executables.

postmaster: Postmaster parameters can be set at Postgres startup, or during source-code compilation. (The postmaster is the parent process of all the Postgres processes, hence the context’s name.) These parameters can be set in the postgresql.conf file or on the command line when starting the Postgres server.

sighup: Sighup parameters can be changed while the server is running, at Postgres startup, or during code compilation. To change such a parameter, edit it in the postgresql.conf file and send a SIGHUP signal to the postmaster process. An easy way to send the SIGHUP signal to the postmaster is to use pg_ctl or your distribution’s Postgres init script, like so:

pg_ctl -D $PGDATA reload
OR
sudo service postgresql-9.3 reload

backend: Backend parameters can be set while making a new connection to Postgres, and never after that (and they can also be changed by SIGHUP, at Postgres startup, or during code compilation). Usually applications set these parameters while making the initial connection. An example is the local_preload_libraries parameter. Say you want to try a plugin for just one session; you can initiate a psql session with that plugin loaded for the connection, like so:

PGOPTIONS="-c local_preload_libraries=my_plugin" psql

This method of changing parameters works for any application that uses the libpq library to connect to Postgres (for example, pgAdmin), since the PGOPTIONS environment variable is recognized and honored by libpq. Other applications/libraries may have their own methods for changing parameters during connection initiation.

superuser: To change a superuser parameter, one needs superuser privileges in Postgres. These parameters can be changed while a session is in progress, during backend startup, using SIGHUP, at Postgres startup, or during code compilation. Normal users cannot change these parameters.

user: Parameters with user context can be changed by any user, at any time, to affect the current session. Since this is the last context in the list, a parameter marked with user context can be changed using any of the methods shown for the other contexts.
The SET command can be used to change a user context parameter’s value, e.g.:

SET work_mem = '32 MB';

Using context in queries

Although, as explained above, there is a certain order to the values of context, there is no built-in way to see this order and exploit that knowledge in queries. Say you want a list of all parameters that cannot be changed by a normal user; there is no straightforward way to get it. To that end, I use the following enum type and use it in queries to extract that information easily:

create type guc_context as enum (
    'internal',
    'postmaster',
    'sighup',
    'backend',
    'superuser',
    'user');

select name as cannot_be_changed_by_user, context
  from pg_settings
 where context::guc_context < 'user';

Other useful information can now be easily extracted using this enum:

select name as parameter,
       context_enum > 'internal' as can_be_changed,
       context_enum = 'postmaster' as change_requires_restart,
       context_enum >= 'sighup' as can_be_changed_by_reload
  from (select name, context::guc_context as context_enum from pg_settings) as v;

            parameter            | can_be_changed | change_requires_restart | can_be_changed_by_reload
---------------------------------+----------------+-------------------------+--------------------------
 allow_system_table_mods         | t              | t                       | f
 application_name                | t              | f                       | t
 archive_command                 | t              | f                       | t
 archive_mode                    | t              | t                       | f
 archive_timeout                 | t              | f                       | t
 array_nulls                     | t              | f                       | t
 authentication_timeout          | t              | f                       | t
 ...
Posted over 10 years ago
I’ll be speaking at FOSDEM 2014 in Brussels! I’ll be giving two talks: “The Worst Day of Your Life” (on finding and recovering from PostgreSQL data corruption and bugs) on January 31st at the Radisson Blu hotel, and “Real-Life PostgreSQL JSON” on February 1st in the PostgreSQL devroom at the main FOSDEM conference.
Posted over 10 years ago
Many people use PostgreSQL to store data and do some basic analysis. However, there is more to data than many people might think. Benford’s law, for instance, is widely used in many fields; accounting fraud, among other things, can be detected nicely using this very simple mathematical law. What […]
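To make that concrete, here is a minimal SQL sketch, assuming a hypothetical table ledger(amount numeric) holding the figures being audited (the table name and the query are mine, not from the post above). Benford's law predicts that leading digit d occurs with relative frequency log10(1 + 1/d), so the query compares the observed leading-digit distribution with that expectation:

select leading_digit,
       count(*) as observed,
       round(100.0 * count(*) / sum(count(*)) over (), 2) as observed_pct,
       round(100.0 * log(1 + 1.0 / leading_digit), 2) as benford_pct
  from (select floor(abs(amount) / power(10, floor(log(abs(amount)))))::int as leading_digit
          from ledger
         where amount <> 0) as d
 group by leading_digit
 order by leading_digit;

A material divergence between observed_pct and benford_pct (for example, far too many amounts starting with 5 or 9) is the kind of signal auditors follow up on -- a hint, not proof of fraud.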
Posted over 10 years ago
Happy new year, everyone!  As of today, you only have two weeks left to submit a talk to pgCon, the PostgreSQL Developer Conference.  You know you want to dazzle other Postgres geeks with your database wizardry, so why don't you give it a try?  Submit now.
Posted over 10 years ago
AWS is, in my humble opinion, pricey. However, they provide a nice alternative to the on-demand EC2 instances most people are familiar with: spot instances. In essence, spot instances allow you to bid on otherwise idle compute capacity. Recent changes to the web console seem to highlight spot instances a bit more than they used to, but I still don't see them mentioned often.

The advantage is you get the same selection of instance sizes, and they perform the same as a normal on-demand instance, for (usually) a fraction of the price. The downside is that they may disappear at a moment's notice if there's not enough spare capacity when someone else spins up a normal on-demand instance, or simply outbids you. It certainly happened to us on occasion, but not as frequently as I originally expected. They also take a couple of minutes to evaluate the bid price when you put in a request, which can be a bit of a surprise if you're used to the almost-instantaneous on-demand instance provision time.

We made extensive use of spot instances in some of the software cluster testing we recently did. For our purposes those caveats were no big deal. If we got outbid, the test could always be restarted in a different Availability Zone with a little more capacity, or we just waited until the demand went down. At the height of our testing we were spinning up 300 m1.xlarge instances at once. Even when getting the best price for those spot instances, the cost of running a cluster that large adds up quickly! Automation was very important. Our test scripts took hold of the entire process, from spinning up the needed instances, kicking off the test procedure and keeping an eye on it, retrieving the results (and all the server metrics, too) once done, to destroying the instances at the end. Here's how we did it.

In terms of high-level key components: first, that test driver script was home-grown and fairly specific to the task. Something like Chef could have been taught to spin up spot instances, but those types of configuration management tools are better at keeping systems up and running; we needed the ability to run a task and immediately shut down the instances when done. That script was written in Python, and leans on the boto library to control the instances.

Second, a persistent 'head node' was kept up and running as a normal instance. This ran a Postgres instance and provided a centralized place for the worker nodes to report back to. Why Postgres? I needed a simple way to number and count the nodes in a way immune to race conditions, and sequences were what came to mind. It also gave us a place to collect the test results and system metrics, and compress them down before transferring out of AWS.

Third, customized AMIs were used. Why script the installation of ssh keys, Java, YCSB or Cassandra or whatever, system configuration like our hyper 10-second-interval sysstat, application parameters, etc., onto each of those 300 stock instances? Do it once on a tiny micro instance, get it how you want it, and snapshot the thing into an AMI. Everything's ready to go from the start.

There, those are the puzzle pieces. Now how does it all fit together? When kicking off a test we give the script a name, a test type, a data node count, and maybe a couple of other basic parameters if needed. The script performs the calculations for dataset size, the number of client nodes needed to drive those data nodes, etc.
Once it has all that figured out, the script creates two tables in Postgres, one for data nodes and one for client nodes, and then fires off two batch requests for spot instances. We give them a launch group name to make sure they're all started in the same AZ, our customized AMI, a bit of userdata, and a bid price:

max_price = '0.10'
instance_type = 'm1.xlarge'
#(snip)
ec2.request_spot_instances(
    max_price,
    ami,
    instance_type=instance_type,
    count=count,
    launch_group=test_name,
    availability_zone_group=test_name,
    security_groups=['epstatic'],
    user_data=user_data,
    block_device_map=bdmap
)

Okay, at this point I'll admit the AMIs weren't quite that simple, as there's still some configuration that needs to happen on instance start-up. Luckily AWS gives us a handy way to do that directly from the API. When making its request for a bunch of spot instances, our script sets a block of userdata in the call. When userdata is formulated as text that appears to be a script -- starting with a shebang, like #!/bin/bash -- that script is executed on first boot. (If you have cloud-init in your AMIs, to be specific, but that's a separate conversation.) We leaned on that to relay the test name and identifier, test parameters, and anything else our driver script needed to communicate to the instances at start. That thus became the glue that tied the instances back to the script execution. It also let us run multiple tests in parallel.

You may have also noticed the call explicitly specifies the block device map. This overrides any default mapping that may (or may not) be built into the selected AMI. We typically spun up micro instances when making changes to the images, and as those don't have any instance storage available we couldn't preconfigure that in the AMI. Setting it manually looks something like:

from boto.ec2.blockdevicemapping import BlockDeviceMapping, BlockDeviceType

bdmap = BlockDeviceMapping()
sdb = BlockDeviceType()
sdb.ephemeral_name = 'ephemeral0'
bdmap['/dev/sdb'] = sdb
sdc = BlockDeviceType()
sdc.ephemeral_name = 'ephemeral1'
bdmap['/dev/sdc'] = sdc
sdd = BlockDeviceType()
sdd.ephemeral_name = 'ephemeral2'
bdmap['/dev/sdd'] = sdd
sde = BlockDeviceType()
sde.ephemeral_name = 'ephemeral3'
bdmap['/dev/sde'] = sde

Then, we wait. The process AWS goes through to evaluate, provision, and boot takes a number of minutes. The script actually goes through a couple of stages at this point. Initially we only watched the tables in the Postgres database, and once the right number of instances reported in, the test was allowed to continue. But we soon learned that not all EC2 instances start as they should. Now the script gets the expected instance IDs, and tells us which ones haven't reported in. If a few minutes pass and one or two still aren't reporting in (more on that in a bit), we know exactly which instances are having problems, and can fire up replacements. Here is example output from the test script log; if i-a2c4bfd1 doesn't show up soon and we can't connect to it ourselves, we can be confident it's never going to check in:

2014-01-02 05:01:46 Requesting node allocation from AWS...
2014-01-02 05:02:50 Still waiting on start-up of 300 nodes...
2014-01-02 05:03:51 Still waiting on start-up of 5 nodes...
2014-01-02 05:04:52 Checking that all nodes have reported in...
2014-01-02 05:05:02 I see 294/300 data servers reporting...
2014-01-02 05:05:02 Missing Instances: i-e833499b,i-d63349a5,i-d43349a7,i-c63349b5,i-a2c4bfd1,i-d03349a3
2014-01-02 05:05:12 I see 294/300 data servers reporting...
2014-01-02 05:05:12 Missing Instances: i-e833499b,i-d63349a5,i-d43349a7,i-c63349b5,i-a2c4bfd1,i-d03349a3
2014-01-02 05:05:22 I see 296/300 data servers reporting...
2014-01-02 05:05:22 Missing Instances: i-e833499b,i-c63349b5,i-a2c4bfd1,i-d63349a5
2014-01-02 05:05:32 I see 298/300 data servers reporting...
2014-01-02 05:05:32 Missing Instances: i-a2c4bfd1,i-e833499b
2014-01-02 05:05:42 I see 298/300 data servers reporting...
2014-01-02 05:05:42 Missing Instances: i-a2c4bfd1,i-e833499b
2014-01-02 05:05:52 I see 299/300 data servers reporting...
2014-01-02 05:05:52 Missing Instances: i-a2c4bfd1
2014-01-02 05:06:02 I see 299/300 data servers reporting...
2014-01-02 05:06:02 Missing Instances: i-a2c4bfd1
2014-01-02 05:06:12 I see 299/300 data servers reporting...
2014-01-02 05:06:12 Missing Instances: i-a2c4bfd1
2014-01-02 05:06:22 I see 299/300 data servers reporting...
2014-01-02 05:06:22 Missing Instances: i-a2c4bfd1
2014-01-02 05:06:32 I see 299/300 data servers reporting...
2014-01-02 05:06:32 Missing Instances: i-a2c4bfd1

Meanwhile on the AWS side, as each instance starts up, that userdata mini-script writes out its configuration to various files. The instance then kicks off a phone-home script, which connects back to the Postgres instance on the head node, adds its own ID, IP address, and hostname, and receives back its node number. (Hurray INSERT ... RETURNING! There's a rough sketch of this at the end of the post.) It also discovers any local instance storage it has, and configures that automatically. The node is then configured for its application role, which may depend on what it's discovered so far. For example, nodes 2-n in the Cassandra cluster will look up the IP address for node 1 and use that as their gossip host, as well as use their node numbers for the ring position calculation. Voila, hands-free cluster creation for Cassandra, MongoDB, or whatever we need.

Back on the script side, once everything's reported in and running as expected, a sanity check is run on the nodes. For example, with Cassandra it checks that the ring reports the correct number of data nodes, or similarly for MongoDB that the correct number of shard servers are present. If something's wrong, the human who kicked off the test (who hopefully hasn't run off to dinner expecting that all is well at this point) is given the opportunity to correct the problem. Otherwise, we continue with the tests, and the client nodes are all instructed to begin their work at the same time, beginning with the initial data load phase.

Coordinated parallel execution isn't easy. Spin off threads within the Python script and wait until each returns? Set up asynchronous connections to each, and poll to see when each is done? Nah, pipe the node IP address list, using the subprocess module, to:

xargs -P (node count) -n 1 -I {} ssh root@{} (command)

It almost feels like cheating. Each step is distributed to all the client nodes at once, and doesn't return until all of the nodes complete. Between each step, we perform a little sanity check, and push out a sysstat comment. Not strictly necessary, but if we're looking through a server's metrics it makes it easy to see which phase/test we're looking at, rather than trying to refer back to timestamps.

run_across_nodes(data_nodes+client_nodes, "/usr/lib/sysstat/sadc -C \\'Workload {0} finished.\\' -".format(workload))

When the tests are all done, it's just a matter of collecting the test results (the output from the load generators) and the metrics. The files are simply scp'd down from all the nodes.
The script then issues terminate() commands to AWS for each of the instances it has used, and considers itself done.

Fun AWS facts we learned along the way:

Roughly 1% of the instances we spun up were duds. I didn't record any hard numbers, but we routinely had instances that never made it through the boot process to report in, or weren't at all accessible over the network. Occasionally it seemed like shortly after those were terminated, a subsequent run would be more likely to get a dud instance. Presumably I was just landing back on the same faulty hardware. I eventually learned to leave the dead ones running long enough to kick off the tests I wanted, then terminate them once everything else was running smoothly.

On rare occasion, instances were left running after the script completed. I never got around to figuring out whether it was a script bug or whether AWS didn't act on a .terminate() command, but I soon learned to keep an eye on the running instance list to make sure everything was shut down when all the test runs were done for the day.
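As promised, a rough sketch of the phone-home step described earlier. The table and column names below are my own invention for illustration (the original script's schema isn't shown in the post), but they show how a sequence-backed table on the head node can hand out race-free node numbers via INSERT ... RETURNING:

create table data_nodes (
    node_number  serial primary key,           -- the sequence hands out numbers without races
    instance_id  text not null unique,         -- EC2 instance ID the worker reports
    ip_address   inet not null,
    hostname     text not null,
    reported_at  timestamptz not null default now()
);

-- Run by each worker at first boot; the returned node_number then drives things
-- like the Cassandra ring-position calculation mentioned above.
insert into data_nodes (instance_id, ip_address, hostname)
values ('i-0123abcd', '10.0.0.17', 'ip-10-0-0-17')
returning node_number;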
Posted over 10 years ago
Well, I've finally fulfilled one of my 2013 New Year's resolutions, which was to write some kind of foreign data wrapper, so without further ado here is the initial implementation. It is basically working but is still very much in the proof-of-concept/unstable/may-eat-your-data stage. Having said that, if anyone has a practical use for this I'd be very interested to hear about it - my main motivation for creating this FDW is to learn more about PostgreSQL internals.
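For anyone who hasn't wired up a foreign data wrapper before, the SQL shape is the same whichever wrapper you pick. The sketch below uses the in-core file_fdw as a stand-in (it is not the wrapper announced above; the server and table names are made up):

create extension file_fdw;

create server local_files foreign data wrapper file_fdw;

-- Expose /etc/passwd as a queryable table, along the lines of the file_fdw docs example.
create foreign table passwd (
    username text, pass text, uid int, gid int,
    gecos text, home text, shell text
) server local_files
  options (filename '/etc/passwd', format 'csv', delimiter ':');

select username, shell from passwd limit 5;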
Posted over 10 years ago
I would like to share the pgpool configuration and its failback mechanism in this post. I hope it will be helpful to you in setting up pgpool and its failback.

Pgpool Installation & Configuration

1. Download pgpool from the link below (the latest version is 3.2.1).
   http://www.pgpool.net/mediawiki/index.php/Downloads
2. Untar pgpool-II-3.2.1.tar.gz and go to the pgpool-II-3.2.1 directory.

3. Install pgpool by executing the commands below:

./configure --prefix=/opt/PostgreSQL92/ --with-pgsql-includedir=/opt/PostgreSQL92/include/ --with-pgsql-libdir=/opt/PostgreSQL92/lib/
make
make install

4. You can see the pgpool files in the /opt/PostgreSQL92/bin location.

/opt/PostgreSQL92/bin $ ls
clusterdb   droplang  pcp_attach_node  pcp_proc_count    pcp_systemdb_info  pg_controldata   pgpool           pg_test_fsync  pltcl_loadmod  reindexdb
createdb    dropuser  pcp_detach_node  pcp_proc_info     oid2name           pcp_pool_status  pcp_stop_pgpool
createlang  ecpg      pcp_node_count   pcp_promote_node

/opt/PostgreSQL92/bin $ ./pgpool --version
pgpool-II version 3.2.1 (namameboshi)

Pgpool Parameters

listen_addresses = '*'
# Controls which clients can connect to pgpool. If we set listen_addresses = 'ip1,ip2', then only those two IPs can access this pgpool.

port = 9999
# Defines the pgpool port.

socket_dir = '/tmp'
# The socket directory.

pcp_port = 9898
# PCP is the pgpool admin utility port. Using the PCP port, we can attach and detach nodes.

pcp_socket_dir = '/tmp'
# PCP socket directory.

backend_hostname0 = '10.176.112.188'                 # Node 0 host IP
backend_port0 = 5432                                 # Node 0 port
backend_weight0 = 1                                  # Node 0 weight in load-balance mode
backend_data_directory0 = '/opt/PostgreSQL92/data'   # Node 0 data directory
backend_flag0 = 'ALLOW_TO_FAILOVER'                  # Whether node 0 is allowed to fail over

backend_hostname1 = '10.176.112.189'                 # Node 1 host IP
backend_port1 = 5432                                 # Node 1 port
backend_weight1 = 1                                  # Node 1 weight in load-balance mode
backend_data_directory1 = '/opt/PostgreSQL92/data'   # Node 1 data directory
backend_flag1 = 'ALLOW_TO_FAILOVER'                  # Whether node 1 is allowed to fail over

enable_pool_hba = on
# Enables pool_hba authentication, analogous to pg_hba authentication. The pool_hba.conf entries must match the pg_hba.conf entries; if the values differ, some users may face connectivity issues, so please make sure both are the same. If we enable pool_hba, we also need to create a "pool_passwd" file in the same location as pgpool.conf. pool_passwd must contain entries of the form "username:md5password"; you can get these usernames and passwords from the "pg_shadow" catalog (see the query just after this parameter list).

authentication_timeout = 60
# Pgpool client authentication timeout.

num_init_children = 32
# How many initial PostgreSQL connections pgpool makes when it starts.

max_pool = 3
# How many pools are kept for each of the num_init_children connections.

pid_file_name = '/opt/PostgreSQL92/data/pgpool.pid'
# Where the pgpool.pid file should be placed.

replication_mode = off
# This should be off if we are using pgpool on top of Slony/streaming replication. Enable it only if you want pgpool's own replication rather than Slony/streaming replication.

load_balance_mode = off
# Turn this on if you want load balancing, i.e. if you want to distribute SELECT queries between the primary and slave servers. Enabling it balances the workload between the primary and the slave.

white_function_list = ''
# Read-only functions/procedures are listed here.

black_function_list = 'nextval,setval'
# Read-write functions/procedures are listed here.

recovery_user = 'postgres'
# The recovery user name.

recovery_password = 'adminedb'
# The recovery user password. The above two parameters apply only to the PCP commands such as pcp_recovery_command, pcp_attach_node, and pcp_detach_node.

recovery_1st_stage_command = 'basebackup'
# The recovery first-stage command, which performs the failback.
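As noted in the enable_pool_hba comment above, the pool_passwd entries can be generated straight from pg_shadow. A minimal sketch (run it as a superuser; the output path below is just an example location next to pgpool.conf):

select usename || ':' || passwd
  from pg_shadow
 where passwd is not null;

For example: psql -At -p 5432 -c "select usename || ':' || passwd from pg_shadow where passwd is not null" template1 > /opt/PostgreSQL92/etc/pool_passwd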
You can find more details about these parameters at the link below.
http://www.pgpool.net/docs/latest/tutorial-en.html

Pgpool start and stop commands

1. Start pgpool using the command below:

./pgpool -f /opt/PostgreSQL92/etc/pgpool.conf -F /opt/PostgreSQL92/etc/pcp.conf -a /opt/PostgreSQL92/etc/pool_hba.conf -d -D -n > /opt/PostgreSQL92/data/pgpool.log 2>&1 &

2. Check the pgpool status:

postgres~/bin> ps -ef | grep pgpool
postgres 4288 3754 0 07:55 pts/1 00:00:00 ./pgpool -f /opt/PostgreSQL92/etc/pgpool.conf -F /opt/PostgreSQL92/etc/pcp.conf -a /opt/PostgreSQL92/etc/pool_hba.conf -d -D -n
postgres 4289 4288 0 07:55 pts/1 ... connection request

3. Connect to the database using the pgpool port 9999:

postgres@localhost:~/bin> psql -p 9999
Password:
psql.bin (9.2.1)
Type "help" for help.

postgres=# show port;
 port
------
 5432
(1 row)

4. You can stop pgpool using the command below:

./pgpool -f /opt/PostgreSQL92/etc/pgpool.conf -F /opt/PostgreSQL92/etc/pcp.conf -a /opt/PostgreSQL92/etc/pool_hba.conf -m fast stop

5. You can reload pgpool using the command below:

./pgpool -f /opt/PostgreSQL92/etc/pgpool.conf -F /opt/PostgreSQL92/etc/pcp.conf -a /opt/PostgreSQL92/etc/pool_hba.conf reload

Pgpool Failback

Failback is a pgpool feature that re-initiates the failed master as a new slave of the new master. To perform this operation, we need to follow the steps below.

1) Go to the pgpool installer location on the new master server:

cd /tmp/pgpool-II-3.2.1/sql/pgpool-recovery
export PATH=/opt/PostgreSQL92/bin:$PATH
make
make install

2) Execute the pgpool-recovery.sql file in the "template1" database of the new master server.

3) Prepare the scripts "basebackup", "pgpool_remote_start" and "test1.sh" (shown below) and place them in the new master's data directory.

4) Include the following parameters in the pgpool.conf of the running pgpool instance. Once the modifications are done, reload pgpool with the following command.

Reload command:
/opt/PostgreSQL92/bin/pgpool -f /opt/PostgreSQL92/etc/pgpool.conf -F /opt/PostgreSQL92/etc/pcp.conf -a /opt/PostgreSQL92/etc/pool_hba.conf reload

Pgpool parameters:
recovery_user = 'postgres'
recovery_password = 'adminedb'
recovery_1st_stage_command = 'basebackup'

**Note: the recovery_user credentials must match the pcp.conf credentials.

5) As a final step, execute the pcp_recovery_command from either the new master or the new slave as shown below.

/opt/PostgreSQL92/bin/pcp_recovery_node -d 1 10.176.112.189 9898 postgres adminedb 0
DEBUG: send: tos="R", len=46
DEBUG: recv: tos="r", len=21, data=AuthenticationOK
DEBUG: send: tos="D", len=6
DEBUG: recv: tos="c", len=20, data=CommandComplete
DEBUG: send: tos="X", len=4

If we get this kind of message from the recovery command, then our setup has been accomplished.

pcp_recovery_node explanation:
-d, --debug : enable debug messages (optional)
timeout     : connection timeout value in seconds; the command exits on timeout
hostname    : pgpool-II hostname
port#       : PCP port number
username    : username for PCP authentication
password    : password for PCP authentication
nodeID      : ID of the node to recover

test1.sh
--------
#!/bin/sh
export PATH=/opt/PostgreSQL92/bin:$PATH
export LD_LIBRARY_PATH=/opt/PostgreSQL92/lib:$LD_LIBRARY_PATH
export PGPASSWORD=adminedb
mv /opt/PostgreSQL92/data1 /opt/PostgreSQL92/data_old
mv /opt/PostgreSQL92/backup /opt/PostgreSQL92/data1
echo "restore_command = 'cp /var/arch_test/%f %p'" > /opt/PostgreSQL92/data1/recovery.conf
echo "standby_mode = 'on'" >> /opt/PostgreSQL92/data1/recovery.conf
echo "primary_conninfo = 'host=10.176.112.188 port=5433 user=replication password=replication application_name=Async_Rep'" >> /opt/PostgreSQL92/data1/recovery.conf
rm -f /opt/PostgreSQL92/data1/_Promote_Me_Primary_Is_Down_5433
echo "trigger_file = '/opt/PostgreSQL92/data1/_Promote_Me_Primary_Is_Down_5433'" >> /opt/PostgreSQL92/data1/recovery.conf

basebackup
----------
#!/bin/sh
export PATH=/opt/PostgreSQL92/bin:$PATH
export LD_LIBRARY_PATH=/opt/PostgreSQL92/lib:$LD_LIBRARY_PATH
export PGPASSWORD=adminedb
echo "archive_command = 'cp %p /var/arch_test/%f && scp %p [email protected]:/var/arch_test/%f'" >> /opt/PostgreSQL92/data1/postgresql.conf
pg_ctl reload
psql -c "select pg_start_backup('pgpool_recovery')" -p 5433
ssh [email protected] mkdir /opt/PostgreSQL92/backup
rsync -C -a --exclude 'postmaster.pid' --exclude 'postmaster.opts' --exclude 'recovery.done' /opt/PostgreSQL92/data1/ [email protected]:/opt/PostgreSQL92/backup/
export PGPASSWORD=adminedb
psql -c "select pg_stop_backup()" -p 5433
rm -f /opt/PostgreSQL92/data/recovery.done
ssh [email protected] sh /opt/PostgreSQL92/test1.sh

pgpool_remote_start
-------------------
#!/bin/bash
ssh [email protected] pg_ctl -D /opt/PostgreSQL92/data1 start

దినేష్ కుమార్
Dinesh Kumar
Posted over 10 years ago
We’re having a lab day!

When: Sat Jan 25, 10am-2pm (lunch provided)
Where: Emma, B-side 6 Building, 5th floor

Come experiment with streaming rep in a friendly environment with a delicious lunch and beverages, instead of in a crisis environment with your boss breathing down your neck.

Agenda, as much as we ever have one:
- Discussion of the various configurations (streaming rep, hot standby, etc)
- Actual setup
- Is it working?
- Discussion of results
- If time allows, we’ll come up with creative ways to break it and look at what’s required to recover.

As with previous labs, we have limited space. Please sign up here. Please come prepared with a laptop with two VMs, both with the latest version of Postgres installed. If you don’t have a laptop, we can set up VMs for you on the community equipment – please indicate that when you sign up, because we need some lead time to do that. If you have any tips, tricks, questions, or horror stories, please bring those, too!

Lunch provided by EnterpriseDB. Space provided by Emma.
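If you want a head start on the "Is it working?" agenda item, two quick checks using standard catalog views (nothing specific to this lab's setup) are:

-- On the standby: returns true while the server is replaying WAL as a standby.
select pg_is_in_recovery();

-- On the primary: one row per connected standby, with sent/replayed WAL positions.
select client_addr, state, sent_location, replay_location
  from pg_stat_replication;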
Posted over 10 years ago
Recently I wanted to find something to rerun a workload (from logfiles) in Postgres. The reasons? Benchmarking, testing, troubleshooting. My colleague Denish reminded me of a project called pgreplay. pgreplay reads a PostgreSQL logfile, extracts the SQL statements, and executes them in the same order, and with (or without) the original timing, against a PG database. I downloaded it, compiled it and went straight to testing.

The setup:

log_min_messages = error  (or more)  (if you know that you have no cancel requests, 'log' will do)
log_min_error_statement = log  (or more)
log_connections = on
log_disconnections = on
log_line_prefix = '%m|%u|%d|%c|'  (if you don't use CSV logging)
log_statement = 'all'
lc_messages must be set to English (the encoding does not matter)
bytea_output = escape  (from version 9.0 on, only if you want to replay the log on 8.4 or earlier)

It is highly recommended that you use CSV logging, because anything that the PostgreSQL server or any loaded modules write to standard error will be written to the stderr log and might confuse the parser.

For small and simple workloads it seemed to work fine. Below is the output from a simple workload:

Replay statistics
=================

Speed factor for replay: 1.000
Total run time: 29.700 seconds
Maximum lag behind schedule: 0 seconds
Calls to the server: 8 (0.269 calls per second)
Total number of connections: 1
Maximum number of concurrent connections: 1
Average number of concurrent connections: 1.000
Average session idle percentage: 99.680%
SQL statements executed: 6 (0 or 0.000% of these completed with error)
Maximum number of concurrent SQL statements: 1
Average number of concurrent SQL statements: 0.003
Average SQL statement duration: 0.016 seconds
Maximum SQL statement duration: 0.038 seconds
Statement duration histogram:
  0 to 0.02 seconds: 83.333%
  0.02 to 0.1 seconds: 16.667%
  0.1 to 0.5 seconds: 0.000%
  0.5 to 2 seconds: 0.000%
  over 2 seconds: 0.000%

So far so good. I wanted to see what would happen when concurrency occurred, so I tried it with pgbench. With 2 concurrent clients everything was OK, so I tried to scale it up a bit to 16 concurrent transactions, and then these nice messages started showing:

"Execution is 30 minutes behind schedule"

The reason? Locks. Lots and lots of locks! But let's look closer at what happened...
While pgreplay was running, trying to handle my logfile the best way it could, a simple lock-investigating query was enough to show me that I would never get the successful prompt that I wanted:

monkey=# SELECT bl.pid     AS blocked_pid,
monkey-#        a.usename  AS blocked_user,
monkey-#        kl.pid     AS blocking_pid,
monkey-#        ka.usename AS blocking_user,
monkey-#        a.query    AS blocked_statement
monkey-#   FROM pg_catalog.pg_locks bl
monkey-#   JOIN pg_catalog.pg_stat_activity a  ON a.pid = bl.pid
monkey-#   JOIN pg_catalog.pg_locks kl         ON kl.transactionid = bl.transactionid AND kl.pid != bl.pid
monkey-#   JOIN pg_catalog.pg_stat_activity ka ON ka.pid = kl.pid
monkey-#  WHERE NOT bl.granted;

 blocked_pid | blocked_user | blocking_pid | blocking_user |                         blocked_statement
-------------+--------------+--------------+---------------+--------------------------------------------------------------------
        7398 | vasilis      |         7403 | vasilis       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
        7398 | vasilis      |         7399 | vasilis       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
        7398 | vasilis      |         7404 | vasilis       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
        7398 | vasilis      |         7402 | vasilis       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
        7402 | vasilis      |         7398 | vasilis       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
        7402 | vasilis      |         7403 | vasilis       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
        7402 | vasilis      |         7399 | vasilis       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
        7402 | vasilis      |         7404 | vasilis       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
        7403 | vasilis      |         7398 | vasilis       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
        7403 | vasilis      |         7399 | vasilis       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
        7403 | vasilis      |         7404 | vasilis       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
        7403 | vasilis      |         7402 | vasilis       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
        7404 | vasilis      |         7398 | vasilis       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
        7404 | vasilis      |         7403 | vasilis       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
        7404 | vasilis      |         7399 | vasilis       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
        7404 | vasilis      |         7402 | vasilis       | UPDATE pgbench_branches SET bbalance = bbalance + $1 WHERE bid = $2;
        7409 | vasilis      |         7404 | vasilis       | UPDATE pgbench_tellers SET tbalance = tbalance + $1 WHERE tid = $2;

Deadlock. The reason? (Quote from the pgreplay documentation:)

While pgreplay makes sure that commands are sent to the server in the order in which they were originally executed, there is no way to guarantee that they will be executed in the same order during replay: network delay, processor contention and other factors may cause a later command to "overtake" an earlier one.
While this does not matter if the commands don't affect each other, it can lead to SQL statements hitting locks unexpectedly, causing replay to deadlock and "hang". This is particularly likely if many different sessions change the same data repeatedly in short intervals.

And that was my case: lots of processes were changing the same data. For the record, I also set up a simple troubleshooting project that works on Postgres, with two users each doing their own thing, and pgreplay worked fine there.

Conclusion: OK, pgreplay didn't do the job for me; I am pretty sure that if I got the logs from a production DB I would run into deadlocks. Still, spending a bit of time with it was worthwhile, because I could use it for replaying simple workload logfiles on a different machine, or simply because it's a cool project that does what it does -- and you never know when you might need something like this.