Thanks to the Bucardo team for responding to my previous post. My cascaded slave replication now works as expected.

Today I noticed there is still something to do with the delta and track tables.
Single table replication scenario:
Db-A/Tbl-T1 (master) => Db-B/Tbl-T2 (slave) => Db-C/Tbl-T3 (cascaded slave)

Every change on table T1 is replicated to T2, then from T2 to T3. After a while, VAC successfully cleans the delta and track tables on Db-A, but not on Db-B.

I detect 2 issues:
1. If the cascaded replication from T2 to T3 succeeds, the delta table on Db-B is not cleaned up by VAC.
2. If the cascaded replication from T2 to T3 fails before the VAC schedule, the delta table on Db-B is cleaned up by VAC anyway, and the cascaded replication from T2 to T3 is lost.

I fixed it by modifying the SQL inside bucardo.bucardo_purge_delta(text, text) on Db-A and Db-B:

-- Delete all txntimes from the delta table that:
-- 1) Have been used by all dbgroups listed in bucardo_delta_targets
-- 2) Have a matching txntime from the track table
-- 3) Are older than the first argument interval
myst = 'DELETE FROM bucardo.'
    || deltatable
    || ' USING (SELECT track.txntime AS tt FROM bucardo.'
    || tracktable
    || ' track INNER JOIN bucardo.bucardo_delta_targets bdt ON track.target=bdt.target'
    || ' WHERE bdt.tablename::regclass::text='
    || quote_literal($2)
    || ' GROUP BY 1 HAVING COUNT(*) = '
    || drows
    || ') AS foo'
    || ' WHERE txntime = tt'
    || ' AND txntime < now() - interval '
    || quote_literal($1);
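To see what the modified function actually executes, here is a small Python sketch that renders the statement the same way the function concatenates it. The table names, interval, and drows value are hypothetical examples, not taken from a real sync:

```python
# Render the purge statement the way the modified bucardo_purge_delta builds it.
# All argument values here are hypothetical examples.
def render_purge_sql(deltatable, tracktable, tablename, keep_interval, drows):
    def quote_literal(s):
        # Minimal stand-in for PostgreSQL's quote_literal()
        return "'" + s.replace("'", "''") + "'"
    return (
        "DELETE FROM bucardo." + deltatable
        + " USING (SELECT track.txntime AS tt FROM bucardo." + tracktable
        + " track INNER JOIN bucardo.bucardo_delta_targets bdt"
        + " ON track.target=bdt.target"
        + " WHERE bdt.tablename::regclass::text=" + quote_literal(tablename)
        + " GROUP BY 1 HAVING COUNT(*) = " + str(drows)
        + ") AS foo"
        + " WHERE txntime = tt"
        + " AND txntime < now() - interval " + quote_literal(keep_interval)
    )

print(render_purge_sql("delta_public_t2", "track_public_t2",
                       "public.t2", "30 minutes", 1))
```

The rendered DELETE only removes a txntime once every dbgroup in bucardo_delta_targets has a matching track row, which is the point of the HAVING COUNT(*) check.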

I need advice from the Bucardo team.

When I read about the new release of Bucardo 5, with its capability of asynchronous multi-master replication, I was eager to get my feet wet before swimming. Not really for the multi-master features; I just wanted to implement simple master-slave-cascaded-slave replication, as I had done with Slony.

Scenario:
master table A => slave table B => cascaded slave table C.

So I built Bucardo 5.1.1 from source, then configured databases, tables, and syncs. I turned on "makedelta" for the slave table that serves as the source for cascaded replication to another slave table. Master-to-slave (A to B) replication ran well, but unfortunately the cascaded replication (B to C) did not work.

Then I dug into the Bucardo.pm source code. In sub start_kid, the does_makedelta hash is populated as follows:

sub start_kid {

    ## ...

    for my $dbname (@dbs_source) {

        $x = $sync->{db}{$dbname};

        for my $g (@$goatlist) {

            ## ...

            ## Set the per database/per table makedelta setting now
            if (defined $g->{makedelta}) {
                if ($g->{makedelta} eq 'on' or $g->{makedelta} =~ /\b$dbname\b/) {
                    $x->{does_makedelta}{$S}{$T} = 1;
                }
            }
        }
    }
}

I suspect the does_makedelta hash was incorrectly populated from @dbs_source; in my opinion, it should be populated from @dbs_target. So I moved the does_makedelta code into a new @dbs_target loop:

sub start_kid {

    ## ...

    for my $dbname (@dbs_source) {

        $x = $sync->{db}{$dbname};

        for my $g (@$goatlist) {

            ## ...

            ## Set the per database/per table makedelta setting now
            ## if (defined $g->{makedelta}) {
            ##     if ($g->{makedelta} eq 'on' or $g->{makedelta} =~ /\b$dbname\b/) {
            ##         $x->{does_makedelta}{$S}{$T} = 1;
            ##         ...
            ##     }
            ## }
        }
    }

    for my $dbname (@dbs_target) {

        $x = $sync->{db}{$dbname};

        for my $g (@$goatlist) {

            next if $g->{reltype} ne 'table';
            ($S,$T) = ($g->{safeschema},$g->{safetable});

            ## Set the per database/per table makedelta setting now
            if (defined $g->{makedelta}) {
                if ($g->{makedelta} eq 'on' or $g->{makedelta} =~ /\b$dbname\b/) {
                    $x->{does_makedelta}{$S}{$T} = 1;
                    $self->glog("Set table $dbname.$S.$T to makedelta", LOG_NORMAL);
                }
            }
        }
    }
}

I have not fully verified whether this change is valid, but at least my cascaded replication now works.
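As a side note, the per-table condition itself is simple. Here is a Python sketch of the same test, just as an illustration of the Perl condition above, not Bucardo code:

```python
import re

def makedelta_enabled(setting, dbname):
    """Mirror the Perl test: a literal 'on' enables makedelta for every
    database; otherwise the setting string is scanned for the database
    name as a whole word (the /\\b$dbname\\b/ match)."""
    if setting is None:
        return False
    return setting == "on" or re.search(r"\b" + re.escape(dbname) + r"\b", setting) is not None

print(makedelta_enabled("on", "db_b"))        # True: enabled everywhere
print(makedelta_enabled("db_b db_c", "db_b")) # True: listed by name
print(makedelta_enabled("db_b db_c", "db_a")) # False: not listed
```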

To speed up my PostgreSQL-backed web application, query results are cached in Memcached. I have been using the pgmemcache extension, so I can communicate with the distributed memory object caching system from within my stored procedures.

The PostgreSQL extension code is based on libmemcached, and I have noticed that starting from version 1.0.16, libmemcached has modified the code of the memcached_stat_get_value function (defined in stats.cc). The resulting string is no longer allocated by the customized allocator but by malloc instead. If pgmemcache >= 2.0.1 (2.1.2 in my case) is built against such a libmemcached version, PostgreSQL will crash once you invoke:
abduldb=# select memcache_stats();

The fix is simply to change the deallocation of the string returned by memcached_stat_get_value from pfree() to free(), in Datum memcache_stats(PG_FUNCTION_ARGS) or server_stat_function(…) (defined in pgmemcache.c):

char *value = memcached_stat_get_value(ptr, &stat, *stat_ptr, &rc);
appendStringInfo(context, "%s: %s\n", *stat_ptr, value);
/* pfree() expects palloc'd memory; the string now comes from malloc */
//pfree(value);
free(value);

For me, pltoolbox is a great extension by Pavel Stehule. I downloaded and built it on PostgreSQL 9.3.0 and ended up with warnings:

warning: implicit declaration of function 'HeapTupleHeaderGetTypeId'
warning: implicit declaration of function 'HeapTupleHeaderGetTypMod'
warning: implicit declaration of function 'HeapTupleHeaderGetDatumLength'

If I simply ignore those warnings, the SQL installation fails:

psql -f pltoolbox.sql -U postgres -h localhost -p 5432 -d targetdb
ERROR: could not load library "/opt/pgsql/9.3.0/lib/pltoolbox.so": /opt/pgsql/9.3.0/lib/pltoolbox.so: undefined symbol: HeapTupleHeaderGetTypeId

The fix is simply to put an include statement in record.c:

#include "access/htup_details.h"

After that, the package build and installation went smoothly.

November 2013 was a great month for Datatrans Informatika and Systran Electronics Zone. "Smartcard for Frequent Traveler" (SFFT), a PostgreSQL-based project presented by the two companies (located in Batam, Indonesia) as a team, won the Gold Award in the ASEAN ICT Award 2013 under the Public Sector category (http://www.aseanictaward.com/index.php/about-aicta/aicta-2013-finalist). The ceremony was held on a beautiful evening, 14 November 2013, at Gardens by the Bay, Marina Bay, Singapore.

[Photos from the award ceremony]

SFFT is an ICT project to help frequent travelers enter and exit Indonesian territory more efficiently via 8 international seaports spread across Riau Islands province. Instead of wasting time in a long queue to show passport pages to an officer at the immigration checkpoint, a traveler can simply insert a smartcard into the reader at an automatic verification gate.

SFFT relies on PostgreSQL database servers installed at the 8 remote nodes on 3 different islands. The nodes are inter-replicated in cascade by means of Slony; its asynchronous nature helps replication work over unreliable network connections.

[Map of the replication nodes across the islands]

SFFT is an Ajax-based web application. Hardware is driven through Mozilla's efficient js-ctypes, a technology that enables communication back and forth between JavaScript code and dynamic-link libraries written in C. At the backend, SFFT has Java servlets deployed on the Tomcat web server. Data exchange with the PostgreSQL database server runs through JDBC, with PgPool there to reduce connection overhead. Extension code based on the gSOAP library has been developed for data exchange between PostgreSQL and the immigration SOAP/XML web service. The overall system design is as follows:

[Diagram of the overall system design]

I would like to thank PostgreSQL. It was a really beautiful evening in the heart of Southeast Asia.

PgDBF: Convert FoxPro to a Non-Public PostgreSQL Schema
Recently, I had to migrate FoxPro tables into PostgreSQL. I downloaded the latest version of PgDBF, originally written by Kirk Strauser (http://sourceforge.net/projects/pgdbf/files/latest/download?source=files), and successfully built the package. It ran well, but all FoxPro data can only be stored in the public schema; PgDBF does not offer an option for another schema.
For that purpose I created a patch. Once the package is rebuilt, I can issue the command with an upper-case -S option, e.g. -S myschema (assuming the lowercase schema already exists in my PostgreSQL database):
pgdbf -S myschema myfox.dbf
--- pgdbf.c	2013-01-06 10:18:41.268248051 +0700
+++ pgdbf_new.c	2013-01-06 11:14:17.766415632 +0700
@@ -31,7 +31,7 @@

 #include "pgdbf.h"

-#define STANDARDOPTS "cCdDeEhm:nNpPqQtTuU"
+#define STANDARDOPTS "cCdDeEhm:nNpPqQtTuUS:"

 int main(int argc, char **argv) {
 /* Describing the DBF file */
@@ -121,6 +121,8 @@
 char *optinputcharset = NULL;
 #endif

+ char *optcustomschema = NULL;
+
 strcpy(optvalidargs, STANDARDOPTS);
 #if defined(HAVE_ICONV)
 /* Note that the declaration for optvalidargs currently reserves exactly
@@ -193,6 +195,9 @@
 case 'U':
 optusetruncatetable = 0;
 break;
+ case 'S':
+ optcustomschema = optarg;
+ break;
 case 'h':
 default:
 /* If we got here because someone requested '-h', exit
@@ -211,9 +216,9 @@
 if(optexitcode != -1) {
 printf(
 #if defined(HAVE_ICONV)
- "Usage: %s [-cCdDeEhtTuU] [-s encoding] [-m memofilename] filename [indexcolumn ...]\n"
+ "Usage: %s [-cCdDeEhtTuU] [-s encoding] [-S customschema] [-m memofilename] filename [indexcolumn ...]\n"
 #else
- "Usage: %s [-cCdDeEhtTuU] [-m memofilename] filename [indexcolumn ...]\n"
+ "Usage: %s [-cCdDeEhtTuU] [-S customschema] [-m memofilename] filename [indexcolumn ...]\n"
 #endif
 "Convert the named XBase file into PostgreSQL format\n"
 "\n"
@@ -234,6 +239,7 @@
 #if defined(HAVE_ICONV)
 " -s the encoding used in the file, to be converted to UTF-8\n"
 #endif
+ " -S the custom postgresql schema\n"
 " -t wrap a transaction around the entire series of statements (default)\n"
 " -T do not use an enclosing transaction\n"
 " -u issue a 'TRUNCATE' command before inserting data\n"
@@ -288,7 +294,7 @@
 * is used for other things, like creating the names of indexes. Despite
 * its name, baretablename may be surrounded by quote marks if the "-q"
 * option for optusequotedtablename is given. */
- baretablename = malloc(strlen(dbffilename) + 1 + optusequotedtablename * 2);
+ baretablename = malloc( (optcustomschema!=NULL ? (strlen(optcustomschema) + 1/*dot schema separator*/) : 0) + strlen(dbffilename) + 1 + optusequotedtablename * 2);
 if(baretablename == NULL) {
 exitwitherror("Unable to allocate the bare tablename buffer", 1);
 }
@@ -304,6 +310,12 @@
 /* Create tablename and baretablename at the same time. */
 t = tablename;
 u = baretablename;
+ if(optcustomschema!=NULL) {
+ for(i=0,j=strlen(optcustomschema);i<j;++i){
+ *u++=tolower(optcustomschema[i]);
+ }
+ *u++ = '.'; /* dot schema separator */
+ }
 if(optusequotedtablename) *u++ = '"';
 while(*s) {
 if(*s == '.') {
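The heart of the patch is simply prefixing the lowercased schema name and a dot separator before the usual table-name handling. A Python sketch of that naming logic (pgdbf itself is C; this function is illustrative and simplifies the real filename-derived naming and quoting rules):

```python
def qualified_table_name(tablename, schema=None, quoted=False):
    """Build the name the way the patched pgdbf does: an optional
    lowercased schema plus a dot separator, then the (optionally
    double-quoted) table name."""
    prefix = schema.lower() + "." if schema is not None else ""
    if quoted:
        return prefix + '"' + tablename + '"'
    return prefix + tablename

print(qualified_table_name("myfox", schema="MySchema"))  # myschema.myfox
```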

Finding whose birthday falls on a day or within a period of days is always interesting. Finding it from a computer database is even more interesting.

Here is a table to start with (all work below is on PostgreSQL version 9.1.3):

CREATE TABLE person
(
id integer NOT NULL,
dob date NOT NULL,
CONSTRAINT "id must be unique" PRIMARY KEY (id)
)
WITH (OIDS=FALSE);

Fill it with 500,000 dummy records with random birthdays:

INSERT INTO person
WITH RECURSIVE t AS (
SELECT 1 AS id, DATE_PLI('1970-01-01'::date, (RANDOM() * 5000)::integer ) AS dob
UNION ALL
SELECT t.id+1 AS id, DATE_PLI('1970-01-01'::date, (RANDOM() * 5000)::integer ) AS dob FROM t
WHERE t.id<500000
)
SELECT t.id, t.dob FROM t;

Now let us examine a query to find out whose birthday is on Dec 30th, 2012:

EXPLAIN ANALYZE
SELECT * FROM person WHERE DATE_PART('MONTH',dob)=DATE_PART('MONTH','2012-12-30'::date) AND DATE_PART('DAY',dob)=DATE_PART('DAY','2012-12-30'::date);

Result:

Seq Scan on person (cost=0.00..14713.00 rows=12 width=8) (actual time=0.226..284.272 rows=1365 loops=1)
Filter: ((date_part('MONTH'::text, (dob)::timestamp without time zone) = 12::double precision) AND (date_part('DAY'::text, (dob)::timestamp without time zone) = 30::double precision))
Total runtime: 286.369 ms

Can we do it faster? Yes, with the help of an index.

CREATE INDEX "person_idx_dob1" ON person USING btree
( INT4PL( DATE_PART('MONTH',dob)::integer*100, DATE_PART('DAY',dob)::integer ) );

Let us rewrite the query in a different way so the index is used:

EXPLAIN ANALYZE
SELECT * FROM person
WHERE INT4PL(DATE_PART('MONTH',dob)::integer*100,DATE_PART('DAY',dob)::integer)=
INT4PL(DATE_PART('MONTH','2012-12-30'::date)::integer*100,DATE_PART('DAY','2012-12-30'::date)::integer);

Result:

Bitmap Heap Scan on person (cost=43.69..2443.18 rows=2500 width=8) (actual time=0.481..3.805 rows=1365 loops=1)
Recheck Cond: (int4pl(((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) = 1230)
-> Bitmap Index Scan on person_idx_dob1 (cost=0.00..43.06 rows=2500 width=0) (actual time=0.305..0.305 rows=1365 loops=1)
Index Cond: (int4pl(((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) = 1230)
Total runtime: 5.783 ms (previously 286.369 ms)

Yes, the index works and it is faster now.

More Challenges?

Finding whose birthday falls within a period of days is more complex than a single day, especially when the period starts on a day close to the end of one year and ends on a day in the next year.
I use a trick to transform a date of birth into the form:

20000 + (month part of the date * 100) + (day part of the date)

If the whole period falls within a single year, then the start and stop dates are transformed into:

20000 + (month part of the start date * 100) + (day part of the start date)
20000 + (month part of the stop date * 100) + (day part of the stop date)

But if the period crosses the year border, it needs lower-range and higher-range detection. The lower range detection:

10000 + (month part of the start date * 100) + (day part of the start date)
20000 + (month part of the stop date * 100) + (day part of the stop date)

And the higher range detection:

20000 + (month part of the start date * 100) + (day part of the start date)
30000 + (month part of the stop date * 100) + (day part of the stop date)
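To make the three cases concrete, here is a Python sketch of the same encoding and range detection. The function names are illustrative; the SQL queries below are what actually run in the database:

```python
from datetime import date

def mmdd(d):
    """Encode month and day as month*100 + day, e.g. Dec 30 -> 1230."""
    return d.month * 100 + d.day

def birthday_in_period(dob, start, stop):
    key = 20000 + mmdd(dob)
    if mmdd(start) <= mmdd(stop):
        # Whole period falls within a single year: one 20000-based range.
        return 20000 + mmdd(start) <= key <= 20000 + mmdd(stop)
    # Period crosses the year border: lower range OR higher range.
    return (10000 + mmdd(start) <= key <= 20000 + mmdd(stop)
            or 20000 + mmdd(start) <= key <= 30000 + mmdd(stop))

print(birthday_in_period(date(1980, 12, 31), date(2012, 12, 30), date(2013, 1, 1)))  # True
print(birthday_in_period(date(1980, 12, 29), date(2012, 12, 30), date(2013, 1, 1)))  # False
```

A dob of Dec 31 encodes to 21230 + 1 = 21231, which lands in the higher range [21230, 30101]; a dob of Jan 1 encodes to 20101, which lands in the lower range [11230, 20101].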

Let us give it a try. We will query whose birthday falls between 1) Dec 30, 2012 and Dec 31, 2012, and 2) Dec 30, 2012 and Jan 01, 2013.

Period within single year: Dec 30 2012 and Dec 31 2012.

EXPLAIN ANALYZE
SELECT * FROM person
WHERE (
( INT4PL( INT4PL(20000,(DATE_PART('MONTH',dob)::integer*100)), DATE_PART('DAY',dob)::integer) )
BETWEEN
( INT4PL( INT4PL(20000,(DATE_PART('MONTH','2012-12-30'::date)::integer*100)),DATE_PART('DAY','2012-12-30'::date)::integer) )
AND
( INT4PL( INT4PL(20000,(DATE_PART('MONTH','2012-12-31'::date)::integer*100)),DATE_PART('DAY','2012-12-31'::date)::integer) )
);

Result:

Seq Scan on person (cost=0.00..32213.00 rows=2500 width=8) (actual time=0.504..611.400 rows=2715 loops=1)
Filter: ((int4pl(int4pl(20000, ((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100)), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) >= 21230) AND
(int4pl(int4pl(20000,((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100)), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) <= 21231))
Total runtime: 614.999 ms

Period crosses years: Dec 30 2012 and Jan 01 2013:

EXPLAIN ANALYZE
SELECT * FROM person
WHERE (
( INT4PL( INT4PL(20000,(DATE_PART('MONTH',dob)::integer*100)), DATE_PART('DAY',dob)::integer) )
BETWEEN
( INT4PL( INT4PL(10000,(DATE_PART('MONTH','2012-12-30'::date)::integer*100)),DATE_PART('DAY','2012-12-30'::date)::integer) )
AND
( INT4PL( INT4PL(20000,(DATE_PART('MONTH','2013-01-01'::date)::integer*100)),DATE_PART('DAY','2013-01-01'::date)::integer) )
)
OR
(
( INT4PL( INT4PL(20000,(DATE_PART('MONTH',dob)::integer*100)), DATE_PART('DAY',dob)::integer) )
BETWEEN
( INT4PL( INT4PL(20000,(DATE_PART('MONTH','2012-12-30'::date)::integer*100)),DATE_PART('DAY','2012-12-30'::date)::integer) )
AND
( INT4PL( INT4PL(30000,(DATE_PART('MONTH','2013-01-01'::date)::integer*100)),DATE_PART('DAY','2013-01-01'::date)::integer) )
);

Result:

Seq Scan on person (cost=0.00..57213.00 rows=4988 width=8) (actual time=1.232..1785.936 rows=4054 loops=1)
Filter: (((int4pl(int4pl(20000, ((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100)), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) >= 11230) AND
(int4pl(int4pl(20000, ((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100)), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) <= 20101)) OR
((int4pl(int4pl(20000, ((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100)), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) >= 21230) AND
(int4pl(int4pl(20000, ((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100)), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) <= 30101)))
Total runtime: 1791.009 ms

Can it be faster?

Create the index:

CREATE INDEX "person_idx_dob2" ON person USING btree
( INT4PL( INT4PL(20000,(DATE_PART('MONTH',dob)::integer*100)), DATE_PART('DAY',dob)::integer) );

Re-run exactly the same queries as above and see how the index works:

Period within single year: Dec 30 2012 and Dec 31 2012:

Result:

Bitmap Heap Scan on person (cost=49.96..2518.20 rows=2500 width=8) (actual time=0.829..6.969 rows=2715 loops=1)
Recheck Cond: ((int4pl(int4pl(20000, ((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100)), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) >= 21230) AND
(int4pl(int4pl(20000, ((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100)), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) <= 21231))
-> Bitmap Index Scan on person_idx_dob2 (cost=0.00..49.34 rows=2500 width=0) (actual time=0.549..0.549 rows=2715 loops=1)
Index Cond: ((int4pl(int4pl(20000, ((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100)), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) >= 21230) AND
(int4pl(int4pl(20000, ((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100)), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) <= 21231))
Total runtime: 10.649 ms (previously 614.999 ms)

Period crosses years: Dec 30 2012 and Jan 01 2013:

Result:

Bitmap Heap Scan on person (cost=101.17..2864.17 rows=4988 width=8) (actual time=1.552..9.987 rows=4054 loops=1)
Recheck Cond: (((int4pl(int4pl(20000, ((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100)), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) >= 11230) AND
(int4pl(int4pl(20000, ((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100)), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) <= 20101)) OR
((int4pl(int4pl(20000, ((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100)), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) >= 21230) AND
(int4pl(int4pl(20000, ((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100)), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) <= 30101)))
-> BitmapOr (cost=101.17..101.17 rows=5000 width=0) (actual time=1.210..1.210 rows=0 loops=1)
-> Bitmap Index Scan on person_idx_dob2 (cost=0.00..49.34 rows=2500 width=0) (actual time=0.495..0.495 rows=1339 loops=1)
Index Cond: ((int4pl(int4pl(20000, ((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100)), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) >= 11230) AND
(int4pl(int4pl(20000, ((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100)), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) <= 20101))
-> Bitmap Index Scan on person_idx_dob2 (cost=0.00..49.34 rows=2500 width=0) (actual time=0.710..0.710 rows=2715 loops=1)
Index Cond: ((int4pl(int4pl(20000, ((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100)), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) >= 21230) AND
(int4pl(int4pl(20000, ((date_part('MONTH'::text, (dob)::timestamp without time zone))::integer * 100)), (date_part('DAY'::text, (dob)::timestamp without time zone))::integer) <= 30101))
Total runtime: 15.224 ms (previously 1791.009 ms)

What about Feb 29th?

Birthdays on Feb 29th must be filtered out if the year being checked is not a leap year. I do not have time to show you, because I am in a hurry to say:

Happy Birthday!