PLSQL for Curve Smoothing  

May 16, 2009 by abdulyadi

In an engineering project, I have been assigned to a task constructing smooth curve for given sets of sample points. This is accomplished by implementing Cubic Splines and Thomas algorithm to solve linear algebraic equation. So, I write PLSQL stored procedure as follows:

CREATE OR REPLACE FUNCTION spline(
IN x_arr numeric[], IN y_arr numeric[], IN resolution numeric,
OUT xs numeric, OUT ys numeric)
RETURNS SETOF record AS
$BODY$
DECLARE
  x numeric[];
  y numeric[];
  e numeric[] DEFAULT ‘{}’::numeric[];
  f numeric[] DEFAULT ‘{}’::numeric[];
  g numeric[] DEFAULT ‘{}’::numeric[];
  r numeric[] DEFAULT ‘{}’::numeric[];
  f2 numeric[] DEFAULT ‘{}’::numeric[];
  count integer;
  i integer;
  j integer;
  idx integer;
  tmp numeric;
BEGIN
  x:=x_arr;
  y:=y_arr;
  count:=ARRAY_UPPER(x,1);
  IF count<3 THEN
    RAISE EXCEPTION ’sample size must be >=3′;
  END IF;
  IF ARRAY_UPPER(y,1)<>count THEN
    RAISE EXCEPTION ‘x and y array size must be identical’;
  END IF;

——–sort x, ascendingly:
  FOR i IN 1..count-1 LOOP
    idx:=i+1;
    FOR j IN i+2..count LOOP
      IF x[j]<x[idx] THEN
        idx:=j;
      END IF;
    END LOOP;
    IF x[idx]=x[i] THEN
      RAISE EXCEPTION ‘x values must be unique’;
    ELSIF x[idx]<x[i] THEN –swap
      tmp=x[i];
      x[i]=x[idx];
      x[idx]=tmp;
      tmp=y[i];
      y[i]=y[idx];
      y[idx]=tmp;
    END IF;
  END LOOP;

——–prepare variables for Thomas algorithm
  FOR i IN 1..count LOOP
    IF i=1 OR i=count THEN
      e:= e || 0.0;
      f:= f || 0.0;
      g:= g || 0.0;
      r:= r || 0.0;
      f2:=f2 || 0.0;
      CONTINUE;
    END IF;
    e:= e || x[i] – x[i-1];
    f:= f || 2*(x[i+1]-x[i-1]);
    g:= g || x[i+1] – x[i];
    r:= r || ( (6/(x[i+1]-x[i])) * (y[i+1]-y[i]) )
      + ( (6/(x[i]-x[i-1])) * (y[i-1]-y[i]) );
    f2:=f2 || 0.0;
  END LOOP;

——–decomposition
  FOR i IN 2..count LOOP
    IF f[i-1]<>0 THEN
      e[i]:=e[i]/f[i-1];
    END IF;
    f[i]:=f[i]-(e[i]*g[i-1]);
  END LOOP;

——–forward substitution
  FOR i IN 2..count LOOP
    r[i]:=r[i]-(e[i]*r[i-1]);
  END LOOP;

——–backward substitution
  FOR i IN REVERSE count-1..1 LOOP
    IF f[i]<>0 THEN
      f2[i]:=(r[i] – (g[i]*f2[i+1])) / f[i];
    END IF;
  END LOOP;

——–do interpolation
  FOR i IN 2..count LOOP
    xs:=x[i-1]-resolution;
    LOOP
      xs:=xs+resolution;
      EXIT WHEN i=count AND xs>x[i];
      EXIT WHEN i=x[i];
      ys:= ( (f2[i-1] * (x[i]-xs)^3)
        + (f2[i] * (xs-x[i-1])^3)
        ) / (6*(x[i]-x[i-1]));
      ys:= ys + (
        (y[i-1]/(x[i]-x[i-1])) -
        (f2[i-1]*(x[i]-x[i-1])/6)
        ) * (x[i]-xs);
      ys:= ys + (
        (y[i]/(x[i]-x[i-1])) -
        (f2[i]*(x[i]-x[i-1])/6)
        ) * (xs-x[i-1]);
      RETURN NEXT;
    END LOOP;
  END LOOP;

  RETURN;
END;
$BODY$
LANGUAGE ‘plpgsql’ IMMUTABLE STRICT;

The stored procedure takes 3 input arguments and 2 outs. x_arr and y_arr are sample points’ x and y. With resolution, you can determine the resolution of smoothed curve’s x.

Output arguments: xs and ys contain the set of x and y of smoothed curve, respectively.

Let us use the function. At first, we have sample points:

# x y
1 3.3 1.2
2 4.2 -1.1
3 6.8 1.5
4 8.2 1.2
5 10.2 2.5

Call the function, supply x and y array as displayed on above table, set resolution to 0.1:

SELECT * FROM spline(
  ’{3.3,4.2,6.8,8.2,10.2}’::numeric[],
   ‘{1.2,-1.1,1.5,1.2,2.5}’::numeric[],
  0.1);

We will have set of records containing xs and ys.:

xs  ys
3.3  1.199999999999999970
3.4  0.8860175045157231811
3.5  0.5764170295261004851
3.6  0.2755805955257859750
3.7  -0.0121097769905662565
3.8  -0.28227206752830211620
… and so on

Let us put them in graph:
curve

The blue-y1 and red-y2 are sample and interpolation points as returned by the spline function, respectively.

Tree-structured data and Nested Hash

February 6, 2009 by abdulyadi

I have just accomplished a task to transfer tree-formed data structure from client application environment into PostgreSQL PLSQL body function. The tree at client side is structured as shown below:

  |–Manager1
  |    |
  |    |–Senior1
  |    |    |–Junior1
  |    |    |–Junior2
  |    |
  |    |–Senior2
  |         |–Junior3
  |         |–Junior4
  |  
  |–Manager2
      |
      |–Senior3
      |    |–Junior5
      |    |–Junior6
      |
      |–Senior4
           |–Junior7
           |–Junior8

Instead of send the item pieces one by one, I want to send all the items at once.

PLSQL Function (convert array of ltree into nested hash)

I decide to package the tree items in the array of LTREE field type provided in PostgreSQL contrib module by Teodor Sigaev and Oleg Bartunov. Later in the function body, I convert the LTREE array into HSTORE (again thanks to Teodor Sigaev and Oleg Bartunov). HSTORE maps key to value. Both are string types. But in my case, the hash contains key which point to another hash, creating nested hash.

Following is the stored procedure in PLSQL created on Postgresql 8.3.1:

CREATE OR REPLACE FUNCTION organization(IN trees ltree[]) RETURNS void AS
$BODY$
DECLARE
——–ltree[] will be converted to nested hash:
————-hash_mgr=>hash_snr=>array(Juniors)
    i integer;
    count integer;
    hash_mgr hstore;
    hash_snr hstore;
    tree ltree;
    arr text[];
    mgr text;
    snr text;
    jnr text;
    k1 text; v1 text;
    k2 text; v2 text;
BEGIN
    count:=ARRAY_UPPER(trees,1);
——–convert array of trees into nested hash:
    FOR i IN 1..count LOOP
        tree:=trees[i];
        mgr:=SUBLTREE(tree,0,1);
        snr:=SUBLTREE(tree,1,2);
        jnr:=SUBLTREE(tree,2,3);
        hash_snr:=(hash_mgr->mgr)::hstore;
        arr:=(hash_snr->snr)::text[];
        arr:=COALESCE(arr,’{}’::text[]) || jnr;
        hash_snr:=COALESCE( hash_snr||(snr=>arr::text),snr=>arr::text);
        hash_mgr:=COALESCE( hash_mgr||(mgr=>hash_snr::text),mgr=>hash_snr::text);
    END LOOP;
——–now we hava nested hash
——–usage 1: get Junior1 and Junior2:
    arr:=((hash_mgr->’Manager1′)::hstore)->’Senior1′;
    RAISE NOTICE ‘% %’,arr[1],arr[2];
——–navigate all items
    FOR k1, v1 IN SELECT key, value FROM EACH(hash_mgr) LOOP
        RAISE NOTICE ‘manager: %’,k1;
        FOR k2, v2 IN SELECT key, value FROM EACH(v1::hstore) LOOP
            RAISE NOTICE ‘ senior: %’,k2;
            RAISE NOTICE ‘ junior: %’,(v2::text[])[1];
            RAISE NOTICE ‘ junior: %’,(v2::text[])[2];
        END LOOP;
    END LOOP;
    RETURN;
END;$BODY$
LANGUAGE ‘plpgsql’ STABLE;

Let’s start from lowest level. Hash hash_snr maps “senior” to array of juniors.
Go up one level, hash hash_mgr maps “manager” to hash hash_snr.

To use the function, client application should invoke it as follows:

SELECT organization(‘{
    Manager1.Senior1.Junior1,
    Manager1.Senior1.Junior2,
    Manager1.Senior2.Junior3,
    Manager1.Senior2.Junior4,
        Manager2.Senior3.Junior5,
        Manager2.Senior3.Junior6,
        Manager2.Senior4.Junior7,
        Manager2.Senior4.Junior8
    }’);

Take note RAISE NOTICE at the lower portion of function body. It will give us messages as follows:

NOTICE:    Junior1 Junior2
NOTICE:    manager: Manager1
NOTICE:        senior: Senior1
NOTICE:                junior: Junior1
NOTICE:                junior: Junior2
NOTICE:        senior: Senior2
NOTICE:                junior: Junior3
NOTICE:                junior: Junior4
NOTICE:    manager: Manager2
NOTICE:        senior: Senior3
NOTICE:                junior: Junior5
NOTICE:                junior: Junior6
NOTICE:        senior: Senior4
NOTICE:                junior: Junior7
NOTICE:                junior: Junior8

Prevent Concurrent Update in Read-Committed Transaction

December 4, 2008 by abdulyadi

To prevent concurrent update on table records, I simply just encapsulate upates inside Serializable Transaction. But, quoted from Annotated postgresql.conf and Global User Configuration (GUC) Guide (http://www.varlena.com/GeneralBits/Tidbits/annotated_conf_e.html):

Under a heavy multi-user load, setting “serializable” can impose a significant penalty as numerous transactions are forced to wait for the serialized transaction to complete.

I want to share with you my simple way to implement the feature in PostgreSQL default transaction isolation level: READ COMMITTED.

First, define the table:

CREATE TABLE inventory(
  id serial NOT NULL,
  item text NOT NULL,
  qty integer NOT NULL,
  ver integer NOT NULL DEFAULT 1,
  CONSTRAINT inventory_pkey PRIMARY KEY (id)
)
WITH (OIDS=FALSE);

The last field ver is the focal point of the concurrent update protection.

Let’s create trigger function in PLSQL to be activated BEFORE INSERT and BEFORE UPDATE.

CREATE OR REPLACE FUNCTION trig_inventory()
RETURNS trigger AS
$BODY$
BEGIN
  IF (TG_WHEN = ‘BEFORE’ AND TG_OP = ‘INSERT’) THEN
    NEW.ver = 1; –force ver initiated with 1
    RETURN NEW;
  ELSIF (TG_WHEN = ‘BEFORE’ AND TG_OP = ‘UPDATE’) THEN
    IF OLD.ver > NEW.ver THEN –somebody has updated this record
      RAISE EXCEPTION ‘Concurrent Update Protection!’;
    END IF;
    NEW.ver = OLD.ver + 1; –increment UPDATE
    RETURN NEW;
  END IF;
END;
$BODY$
LANGUAGE ‘plpgsql’ VOLATILE;

Assign the trigger function to inventory table:

CREATE TRIGGER inventory_before_insert BEFORE INSERT
  ON inventory FOR EACH ROW EXECUTE PROCEDURE public.trig_inventory();
CREATE TRIGGER inventory_before_update BEFORE UPDATE
  ON inventory FOR EACH ROW EXECUTE PROCEDURE public.trig_inventory();

Now, let us try the feature (use psql):
First, insert record:

citra=# INSERT INTO inventory (item,qty) VALUES (‘ITEM_1′,50) RETURNING *;
id | item | qty | ver
—-+——–+—–+—–
2 | ITEM_1 | 50 | 1
(1 row)INSERT 0 1

Please note that field ver has been initialized to 1;
Now, update the record to check ver incremented:

citra=# BEGIN;
BEGIN
citra=# UPDATE inventory SET qty=qty+100 WHERE id=2 RETURNING *;
id | item | qty | ver
—-+——–+—–+—–
2 | ITEM_1 | 150 | 2
(1 row)UPDATE 1
citra=# COMMIT;

Finally, let’s simulate concurrent update protection. Besides current psql session, please open another session.
For simplicity, I assume that users on both session has queried the record and identify the ver equals 2. Please ensure that the ver field is included in update command.

session #1   session #2
citra=# BEGIN;   citra=# BEGIN;
citra=# UPDATE inventory SET qty=qty+100,ver=2 WHERE id=2 RETURNING *;   citra=# UPDATE inventory SET qty=qty+100,ver=2 WHERE id=2 RETURNING *;
citra=# COMMIT;   citra=# COMMIT;
    ERROR: Concurrent Update Protection!

The second session get the error message. The user must re-query the record to get the latest ver and use it in next update command.

Cross-tab Report

November 27, 2008 by abdulyadi

Another case with aggregate function. A sales table is defined and contains raw records of sales activity as follows:

CREATE TABLE sales
(
  id serial NOT NULL,
  item text NOT NULL,
  sales_unit text NOT NULL,
  amount numeric NOT NULL,
  CONSTRAINT sales_pkey PRIMARY KEY (id)
)
WITH (OIDS=FALSE);

id item    sales_unit amount
1  ITEM_1  UNIT_1     1000.5
2  ITEM_1  UNIT_2     1100
3  ITEM_1  UNIT_3     1150.25
4  ITEM_1  UNIT_1     100.75
5  ITEM_2  UNIT_2     275
6  ITEM_2  UNIT_3     750.5
7  ITEM_2  UNIT_2     300.5

Total sales amount for each sales unit for each item will be presented in cross-tab format:

item    UNIT_1   UNIT_2   UNIT_3
ITEM_1  1101.25  1100     1150.25
ITEM_2  0        575.5    750.5

To create aggregate function, let us prepare the State Transition Function:

CREATE OR REPLACE FUNCTION sales_amount(
  amounts numeric[],
  unit text,
  amount numeric,
  units text,
  unitcount integer)
RETURNS numeric[] AS
$BODY$
DECLARE
  out_amounts numeric[];
  unit_pattern text;
  sub_unit text[];
  idx integer;
BEGIN
  out_amounts:=amounts;
  ——–units in format: [unit_name]:[array_index],[unit_name]:[array_index],…
  unit_pattern:= unit || ‘:[0-9]{1,}’;
  sub_unit:=regexp_matches(units,unit_pattern);
  IF sub_unit IS NULL THEN
    RETURN out_amounts;
  END IF;
  ——–get array_index for dept
  idx:=split_part(array_to_string(sub_unit,”),’:',2)::integer;
  IF array_upper(out_amounts,1) IS NULL THEN
  ——–initiate amount array with 0
    out_amounts:=string_to_array(
    ’0′ || repeat( ‘,0′, unitcount-1 ) ,
    ’,’ )::numeric[];
  END IF;
  out_amounts[idx]:=out_amounts[idx]+amount;
  RETURN out_amounts;
END;$BODY$
LANGUAGE ‘plpgsql’ IMMUTABLE STRICT;

Here is the aggregate function:

CREATE AGGREGATE sales_agg(text, numeric, text, integer) (
  SFUNC=sales_amount,
  STYPE=numeric[],
  INITCOND=’{}’
);

Now, let us create user defined function:

CREATE OR REPLACE FUNCTION create_sales_report(
  refcursor, –cursor name
  units text[] –array of your expected units to appear on corss-tab report
)
RETURNS void AS
$BODY$
DECLARE
  unit_count integer;
  units_series text;
  i integer;
  sql text;
BEGIN
  unit_count:=array_upper(units,1);
  ——–convert units into [dept_name]:[array_index],[dept_name]:[array_index],…
  ——–to avoid repetitive loops inside aggregate function
  units_series:=”;
  FOR i in 1..unit_count LOOP
    IF i>1 THEN
      units_series:= units_series || ‘,’;
    END IF;
  units_series:= units_series || units[i] || ‘:’ || i;
  END LOOP;
  ——–compose sql statement
  sql:=    ‘SELECT t.item’;
  i:=1;
  WHILE units[i] IS NOT NULL LOOP
    sql:= sql || ‘,t.amounts[' || i || '] AS ‘ || quote_ident(units[i]);
    i:=i+1;
  END LOOP;
  sql:=sql || ‘ FROM (SELECT item,’
    || ’sales_agg(sales_unit,amount,’
    || quote_literal(units_series) || ‘,’
    || unit_count || ‘) AS amounts ‘
  || ‘FROM sales GROUP BY item’
  || ‘) t ORDER BY t.item’;
  ——–open cursor
  OPEN $1 FOR EXECUTE sql;
RETURN;
END;$BODY$
LANGUAGE ‘plpgsql’ IMMUTABLE STRICT;

Following is how the function is called in psql console:

citra=# begin;
BEGIN
citra=# select create_sales_report(‘report’,'{UNIT_1,UNIT_2,UNIT_3}’);
create_sales_report
———————

(1 row)

citra=# FETCH ALL IN report;
item  | UNIT_1  | UNIT_2 | UNIT_3
——–+———+——–+———
ITEM_1 | 1101.25 |   1100 | 1150.25
ITEM_2 |       0 |  575.5 |   750.5
(2 rows)

citra=# COMMIT;

The query can be expanded to have more sales units:

#select create_sales_report(‘report’,'{UNIT_1,UNIT_2,UNIT_3,…,…}’);

Or, to display all available sales units in the sales table, just use array_agg (see my previous post):

#select create_sales_report(‘report’,
  (SELECT array_agg(t.sales_unit) FROM (SELECT DISTINCT ON (sales_unit) sales_unit FROM sales) t)
);

array_agg for PostgreSQL version < 8.4

November 24, 2008 by abdulyadi

In relation with article by Hubert Lubaczewski ‘Waiting for 8.4 – array aggregate and array unpacker’, I have the need to use the former feature for one of my project.

I have my aggregate function as follows:

CREATE AGGREGATE array_agg(anyelement) (
SFUNC=array_append,
STYPE=anyarray,
INITCOND=’{}’
);

The sample of table definition looks like:

CREATE TABLE tes
(
id integer NOT NULL,
grp integer NOT NULL,
label text NOT NULL,
CONSTRAINT tes_pkey PRIMARY KEY (id)
) WITH (OIDS=FALSE);

Populate table so it contains records:

id grp     label
1      1       ABC
2      1      DEF
3      2      GHI
4      2      JKL

Run query:
SELECT grp, array_agg(label) FROM tes GROUP BY grp;
Result:
grp array_agg
1 {ABC,DEF}
2 {GHI,JKL}

The array at the second column then -of course- can be converted to string:
SELECT grp,array_to_string( array_agg(label), ‘,’ ) FROM tes GROUP BY grp;
Result:
grp array_to_string
1 ABC,DEF
2 GHI,JKL