postgresql — last revision 2018/08/27 13:41 by mantis (previous revision 2016/03/03 12:45 by mstraub)
====== Installation & Setup ======

For Ubuntu 12.04:

<code bash>
sudo apt-get install postgresql-9.1 pgadmin3
</code>

pgAdmin III is a strongly recommended GUI for administering PostgreSQL. Nevertheless, here are the first steps on the command line right after installing.

===== Server Configuration =====

==== Enable remote TCP/IP connections ====

Allow TCP/IP connections from all IPs (default: localhost only) by adding this line to ''/etc/postgresql/9.1/main/postgresql.conf'':

<code>
listen_addresses = '*'
</code>

Then allow password authentication for all users from all IPs by adding this line to ''/etc/postgresql/9.1/main/pg_hba.conf'':

<code>
host    all    all    0.0.0.0/0    md5
</code>

For more restrictive client authentication settings have a look at [[http://www.postgresql.org/docs/9.1/static/auth-pg-hba-conf.html|pg_hba.conf]].

===== User Management =====

First steps with a user of your choosing:

<code bash>
# log in with the user created by default (which has no password set)
sudo -u postgres psql

# create a user with your Linux user name and log in
# (on Ubuntu the postgres user that matches your Linux user name
# can by default log in without supplying a password)
sudo -u postgres createuser --superuser $USER
psql -d postgres
# set a password
\password

# create a restricted user with password (long and short version)
sudo -u postgres createuser --no-superuser --no-createdb --no-createrole --pwprompt username
sudo -u postgres createuser -SDRP username
psql -U username -d postgres
</code>

<code bash>
# drop a user
sudo -u postgres dropuser username
</code>

==== Setting a password for user postgres ====
<code sql>
\password postgres
</code>

==== Create Readonly User ====
**1. Create the user**
<code bash>
sudo -u postgres createuser --no-superuser --no-createdb --no-createrole --pwprompt username
</code>

**2. Grant readonly permissions**

This is the tedious part (in PostgreSQL versions prior to 9.0). We need to grant the user the permission to
* connect to the DB
* use the schema
* perform a SELECT on every object in the schema

<code sql>
GRANT CONNECT ON DATABASE mydb TO username;
GRANT USAGE ON SCHEMA public TO username;

-- Generate GRANT SELECT statements for all tables/views in the schema
SELECT 'GRANT SELECT ON public.' || relname || ' TO username;'
FROM pg_class JOIN pg_namespace ON pg_namespace.oid = pg_class.relnamespace
WHERE nspname = 'public' AND relkind IN ('r', 'v');
</code>

=== Slightly different approach ===
Creating a true read-only user is a real pain in PostgreSQL. Just revoking privileges from a user still allows them to create database objects which they own. You need to revoke the CREATE and USAGE privileges from the SCHEMA public and the GROUP public.

<code sql>
-- Revoke default permissions from the public group:
REVOKE CREATE ON SCHEMA public FROM PUBLIC;
REVOKE USAGE ON SCHEMA public FROM PUBLIC;

-- Add back permissions for the users allowed to create stuff
GRANT CREATE ON SCHEMA public TO rw_user;
GRANT USAGE ON SCHEMA public TO rw_user;

-- Add SELECT permissions for the read-only user
GRANT USAGE ON SCHEMA public TO readonly_user;
GRANT SELECT ON my_table TO readonly_user;
</code>

See http://www.ruizs.org/archives/89 for details.

==== Grant on all tables ====

With a single [[http://www.postgresql.org/docs/current/static/sql-grant.html|GRANT command]] (available since 9.0) permissions for all tables in a schema can be changed:
<code>
GRANT ALL ON ALL TABLES IN SCHEMA public TO user;
</code>

==== Change owner on all tables ====

CLI: Unfortunately this can only be achieved via scripting, but we don't have to reinvent the wheel: https://gist.github.com/jirutka/afa3ce62b1430abf7572

GUI: In pgAdmin III right-click the schema, go to properties, and set the owner to your liking.
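
In the same spirit as the generated GRANT SELECT statements above, a minimal SQL sketch can generate the ALTER TABLE statements itself (''new_owner'' and the ''public'' schema are placeholders; run the generated statements afterwards):

<code sql>
-- Generate ALTER TABLE ... OWNER TO statements for all tables in a schema
SELECT 'ALTER TABLE public.' || tablename || ' OWNER TO new_owner;'
FROM pg_tables
WHERE schemaname = 'public';
</code>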
===== PSQL meta-commands =====

After logging into PostgreSQL it's good to know some often-used meta-commands (read more in ''man psql''):

<code sql>
-- list all users
\du

-- list all schemas
\dn

-- list tables
\d
\d+          -- also show comments
\d tablename -- show details of a table

-- list all tables / tables of a certain schema
\dt *.*          -- list all tables in the DB
\dt schemaname.* -- list all tables in a certain schema

-- list all views
\dv *.* -- list all views in the DB

-- list all databases on the host
\l

-- connect to a database
\c dbname

-- execute a command on the Linux shell (as-is)
\! command

-- quit
\q
</code>

For more meta-commands check out [[http://www.postgresql.org/docs/9.1/static/app-psql.html|the documentation]].

===== List and Disconnect Active Sessions =====

<code sql>
-- list active connections (and get procpids)
SELECT * FROM pg_stat_activity;

-- cancel a process but leave the user's session open
SELECT pg_cancel_backend(<procpid>);

-- terminate a user's query and session
SELECT pg_terminate_backend(<procpid>);
</code>
====== DB Management ======
===== Show DB Sizes =====
<code sql>
SELECT pg_database.datname,
       pg_size_pretty(pg_database_size(pg_database.datname)) AS size
FROM pg_database
ORDER BY pg_database_size(pg_database.datname) DESC;
</code>

===== Find large databases =====
<code sql>
SELECT d.datname AS name, pg_catalog.pg_get_userbyid(d.datdba) AS owner,
       CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')
            THEN pg_catalog.pg_size_pretty(pg_catalog.pg_database_size(d.datname))
            ELSE 'No Access'
       END AS size
FROM pg_catalog.pg_database d
ORDER BY
  CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')
       THEN pg_catalog.pg_database_size(d.datname)
       ELSE NULL
  END DESC -- nulls first
LIMIT 20;
</code>

See also: https://wiki.postgresql.org/wiki/Disk_Usage

===== Find large tables (in one DB) =====
<code sql>
SELECT nspname || '.' || relname AS "relation",
       pg_size_pretty(pg_relation_size(C.oid)) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_relation_size(C.oid) DESC
LIMIT 20;
</code>

===== Show Table size =====

<code sql>
-- size of data
SELECT pg_size_pretty(pg_total_relation_size('<TABLE_NAME>'));

-- size of indices
SELECT c2.relname, c2.relpages * 8 / 1024 AS "MB"
FROM pg_class c, pg_class c2, pg_index i
WHERE c.relname = '<TABLE_NAME>' AND
      c.oid = i.indrelid AND
      c2.oid = i.indexrelid
ORDER BY c2.relname;
</code>
====== SQL ======
===== Date and Time =====
* http://www.postgresql.org/docs/9.1/interactive/datatype-datetime.html
* http://www.postgresql.org/docs/9.1/static/functions-datetime.html

==== Add 15-min interval to time ====
<code sql>
SELECT DATE(timestamp) + (interval_id) * time '00:15:00' FROM ...
</code>
==== Extract date-parts ====
<code sql>
SELECT extract(minute FROM timestamp '2008-08-01 12:14:13');
</code>
See http://www.postgresql.org/docs/8.3/static/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT

==== Extract TIME from TIMESTAMP ====
<code sql>
SELECT "time"(mytimestamp) FROM table;
</code>

==== Convert Timestamp to Seconds ====
To convert a timestamp / interval to seconds (for a timestamp: seconds since 1970-01-01):

<code sql>
SELECT EXTRACT(EPOCH FROM timestamp_or_interval);
</code>

==== Creating Timestamps from Text Fields with missing leading Zeros ====

If your data looks like this:

  "2011/3/15 ";"6:17:3 "

you can construct timestamps using the "FM" (fill mode) modifier:

<code sql>
SELECT to_timestamp(utc_date || ' ' || utc_time, 'YYYY/FMMM/FMDD FMHH24:FMMI:FMSS');
</code>

==== Update parts of date ====
<code sql>
UPDATE datetesting SET d = to_timestamp(to_char(d, 'YYYY-MM-DD 12:00:00'), 'YYYY-MM-DD HH24:MI:SS');
</code>

==== Time Zone Information ====
Time zone handling is confusing, so here it goes...

=== Converting from string to TIMESTAMP ===

<code sql>
SELECT '2013-05-18 12:30:00'::TIMESTAMP WITH TIME ZONE;
-- 2013-05-18 12:30:00+02
-- no time zone info provided; Postgres simply assumes the session's current TIME ZONE

SELECT '2013-05-18 12:30:00+00'::TIMESTAMP WITH TIME ZONE AS with_tz,
       '2013-05-18 12:30:00+05'::TIMESTAMP WITHOUT TIME ZONE AS without_tz;
--         with_tz         |     without_tz
-- ------------------------+---------------------
--  2013-05-18 14:30:00+02 | 2013-05-18 12:30:00
</code>

**Notes**
* When no TIME ZONE info is provided, but Postgres needs it, it simply assumes the current TIME ZONE of the session.
* Timestamps WITH TIME ZONE are displayed in the TIME ZONE of the current session, unless explicitly requested otherwise (see below).
* The TIME ZONE part of a string is simply **ignored** if you are converting to TIMESTAMP WITHOUT TIME ZONE.
* This also applies when you insert into a column of type TIMESTAMP WITHOUT TIME ZONE.

<code sql>
SELECT '2013-05-18 12:30:00+06'::TIMESTAMP;
-- 2013-05-18 12:30:00 --> the +06 has been completely ignored

SELECT '2013-05-18 12:30:00+06'::TIMESTAMP WITH TIME ZONE;
-- 2013-05-18 08:30:00+02 --> 6:30 UTC --> 8:30 CEST (+02)
</code>

=== Explicitly setting the TIME ZONE ===
<code sql>
SET TIME ZONE 'UTC';
SELECT '2013-05-18 12:30:00+00'::TIMESTAMP WITH TIME ZONE;
-- 2013-05-18 12:30:00+00

SET TIME ZONE 'CET';
SELECT '2013-05-18 12:30:00+00'::TIMESTAMP WITH TIME ZONE;
-- 2013-05-18 14:30:00+02
</code>

=== Converting from epoch values to TIMESTAMP ===
<code sql>
SELECT to_timestamp(1358749391);
-- "2013-01-21 07:23:11+01"
</code>

**Notes:**
* Postgres converts the epoch value to TIMESTAMP WITH TIME ZONE. That's why the result is displayed in the local time zone (CET, since the example date falls in winter).

=== Discard TIME ZONE Information ===
<code sql>
SELECT to_timestamp(1358749391)::TIMESTAMP WITHOUT TIME ZONE;
-- "2013-01-21 07:23:11"
</code>
**Notes**
* Does the same thing as before, but also discards the time zone information and returns a TIMESTAMP WITHOUT TIME ZONE.

=== Convert to other TIME ZONE ===
<code sql>
SELECT to_timestamp(1358749391) AT TIME ZONE 'utc';
-- "2013-01-21 06:23:11"
</code>

**Notes**
* Displays the value at time zone UTC --> 6:23.
* The result is a TIMESTAMP WITHOUT TIME ZONE. In this case Postgres effectively removes the time zone information from the result.
* See http://www.postgresql.org/docs/9.2/static/functions-datetime.html#FUNCTIONS-DATETIME-ZONECONVERT

===== Useful Functions =====
==== Aggregate Concat ====
<code sql>
CREATE AGGREGATE textcat_all(
  basetype = text,
  sfunc    = textcat,
  stype    = text,
  initcond = ''
);
</code>
**Usage:**

<code sql>
SELECT company_id, textcat_all(employee || ', ') FROM mytable GROUP BY company_id;
</code>

==== row_number ====

This is useful e.g. if you need an OID where none is present:

<code sql>
SELECT row_number() OVER (ORDER BY NULL) AS oid FROM table;
</code>

The ORDER BY clause is required, but in this special case we do not require any particular order, so we use NULL here.

==== Round time to interval ====
<code sql>
CREATE OR REPLACE FUNCTION round_time_to_interval(TIMESTAMP)
RETURNS TIMESTAMP AS $$
  SELECT date_trunc('hour', $1) + INTERVAL '15 min' * FLOOR(date_part('minute', $1) / 15.0)
$$ LANGUAGE SQL;
</code>

==== Rows to Columns ====
This is an obscure one, but it is very handy if you must transform a number of rows into a number of columns. This can be achieved with the crosstab() function provided by the tablefunc module of postgresql-contrib. So first we need to install postgresql-contrib and enable the function in our DB:

<code bash>
sudo apt-get install postgresql-contrib
sudo service postgresql-8.4 restart
cd /usr/share/postgresql/8.4/contrib
psql -U username -d dbname -h localhost -f tablefunc.sql
</code>

Next have a look at the documentation of crosstab here: http://www.postgresql.org/docs/8.4/static/tablefunc.html

Here is an example of how to aggregate a traffic state from rows to columns (for a number of known roads):

<code sql>
SELECT * FROM
crosstab(
  'SELECT valid_from, road_id, q FROM ait.its_traffic_state WHERE road_id IN (23005242,23008577) ORDER BY 1',
  'SELECT DISTINCT road_id FROM ait.its_traffic_state WHERE road_id IN (23005242,23008577) ORDER BY 1'
)
AS its_traffic_state(time timestamp, road1 integer, road2 integer);
</code>
...for a higher number of columns you may need a script to generate the column_name/datatype row at the end of the query (if anyone knows a better solution please tell).
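
One possible sketch for that generation step, using SQL itself (''string_agg()'' requires PostgreSQL 9.0 or later; table and column names are the ones from the example above):

<code sql>
-- Generate the column_name/datatype list for the AS clause;
-- paste the resulting string into the crosstab query
SELECT 'AS its_traffic_state(time timestamp, ' ||
       string_agg('road_' || road_id || ' integer', ', ' ORDER BY road_id) || ');'
FROM (SELECT DISTINCT road_id FROM ait.its_traffic_state
      WHERE road_id IN (23005242,23008577)) r;
</code>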

===== Updating Sequence Counters =====
When data is inserted with an explicit value for the auto-incrementing column, the underlying sequence is not advanced and must be updated manually. This is the case e.g. when importing dumps using COPY, or with a plain INSERT statement that sets the auto-incrementing column to a value.

Since the existing values (e.g. id 1) were used while the sequence counter still starts at its initial value, creating new records fails because a new record would get a duplicate id (1 == 1).

The fix:
<code sql>
SELECT SETVAL('sequence_name', MAX(auto_incrementing_column)) FROM table_name;
</code>

===== SQL Hints =====

==== Timeseries completion ====
Assume you have incomplete timeseries in your database but want to select a complete series (padding missing values with zero). The following query takes care of the padding and replaces missing values with 0:

<code sql>
SELECT intervals.intervals AS interval, coalesce(q, 0)
FROM timeseries t
RIGHT OUTER JOIN generate_series(0, 24) intervals
  ON t.interval = intervals.intervals;
</code>

==== Timeseries creation ====
To create a series of timestamps:
<code>
SELECT * FROM
generate_series('2013-11-01'::timestamp, '2013-11-02'::timestamp, '15 minutes');
</code>
==== Convert Array to Rows ====
This might be useful to replace WHERE x IN ('a','b','c') statements (which can be slow in PostgreSQL) with JOINs.

<code sql>
SELECT i, (ARRAY['a','b','c'])[i]
FROM generate_series(1, array_upper(ARRAY['a','b','c'], 1)) i;

--  i | array
-- ---+-------
--  1 | a
--  2 | b
--  3 | c
-- (3 rows)
</code>

It is not very elegant though, especially as you need to specify the array twice (or know the size of the array, in which case you can replace the array_upper call with the size). If someone has a better solution I would be interested :)
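
One candidate, available since PostgreSQL 8.4, is ''unnest()'', which expands an array into rows directly and only needs the array once:

<code sql>
-- Expand an array into rows without specifying it twice
SELECT unnest(ARRAY['a','b','c']) AS val;
</code>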

====== Stored Procedures ======
===== PL/Python =====
* http://www.postgresql.org/docs/8.3/interactive/plpython.html

==== Aggregate Functions in PL/Python ====
* Aggregates in Postgres (basics): [[http://www.postgresonline.com/periodical.php?i_id=7#item-67]]
* PL/Python specific: [[http://www.postgresonline.com/journal/index.php?/archives/102-PLPython-Part-4-PLPython-meets-aggregates.html]]

===== PL/Java =====
==== Installing ====
Download the most recent version from http://pgfoundry.org/frs/?group_id=1000038 . Most of the instructions in this section are taken from the README file in the tarball.
* unzip pljava to /opt/pljava
* sudo vi /etc/postgresql/8.3/main/postgresql.conf

<code>
SET the following parameters:
dynamic_library_path = '$libdir:/opt/pljava'
log_min_messages = info #optional
custom_variable_classes = 'pljava'

ADD the following parameters:
pljava.classpath = '/opt/pljava/pljava.jar'
</code>
* update ldconfig so postgres finds the required shared objects. Create a file **/etc/ld.so.conf.d/pljava.conf** containing one path per line:

<code>
/usr/lib/jvm/java-6-sun/jre/lib/amd64
/usr/lib/jvm/java-6-sun/jre/lib/amd64/server
/usr/lib/jvm/java-6-sun/jre/lib/amd64/native_threads
</code>
* run

<code bash>
sudo ldconfig
</code>
* install PL/Java

<code bash>
sudo su postgres
psql -f install.sql
</code>
===== PL/R =====
==== Installing ====
* Install the PL/R package:

<code bash>
sudo aptitude install postgresql-8.3-plr
</code>
* On Ubuntu 8.04 the package does not exist, but the Jaunty version works just fine. Search for the correct package on http://packages.ubuntu.com/search?keywords=postgresql-8.3-plr then do something like:

<code>
wget http://de.archive.ubuntu.com/ubuntu/pool/universe/p/plr/postgresql-8.3-plr_8.3.0.6-3_amd64.deb
dpkg -i postgresql-8.3-plr_8.3.0.6-3_amd64.deb
</code>
* Install PL/R in the database of choice:

<code>
psql <dbname> < /usr/share/postgresql/8.3/plr.sql
</code>
* Test:

<code sql>
CREATE OR REPLACE FUNCTION r_max (integer, integer) RETURNS integer AS '
  if (arg1 > arg2)
    return(arg1)
  else
    return(arg2)
' LANGUAGE 'plr' STRICT;

SELECT r_max(10,20);
</code>
==== Using ====
Have a look at the guide here: http://www.joeconway.com/plr/doc/index.html

====== Useful commands ======
===== Inserting from CSV Files =====
To efficiently insert a lot of records from CSV files into the table "tolldata" use the [[http://www.postgresql.org/docs/9.1/static/sql-copy.html|COPY]] command:

<code bash>
cat *.csv | psql -c "COPY tolldata FROM stdin WITH DELIMITER ';'" -d toll_tst -U toll -h localhost
</code>
Files named in a COPY command are read or written directly by the server, not by the client application. Therefore, they must //reside on or be accessible to the database server machine//, not the client. They must be accessible to and readable or writable by the PostgreSQL user (the user ID the server runs as), not the client. COPY naming a file is only allowed to database superusers, since it allows reading or writing any file that the server has privileges to access.
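
If you cannot place files on the server, psql's client-side ''%%\copy%%'' meta-command is an alternative: the file is read by psql on the client and streamed through the connection, and no superuser rights are needed. A sketch with the table and connection details from the example above (''data.csv'' is a placeholder):

<code bash>
# client-side copy: the file is read by psql, not by the server
psql -d toll_tst -U toll -h localhost -c "\copy tolldata FROM 'data.csv' WITH DELIMITER ';'"
</code>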

===== Outputting in CSV Format =====
If you need to export an entire table:

<code>
COPY table TO '/tmp/table.csv' DELIMITER ',' CSV HEADER;
</code>

If the query is more complicated:

<code sql>
-- toggle to unaligned output mode
\a
-- set ; as field delimiter
\f ;
-- disable the footer (row count)
\pset footer off

SELECT * FROM table \g file.csv
</code>
===== Piping query result to file =====
On the psql command line:

<code sql>
SELECT * FROM table \g filename
</code>
([[http://archives.postgresql.org/pgsql-general/2001-12/msg00687.php|see more details]])

===== Show Tables =====
<code sql>
\dt
-- or, for a certain schema
\dt my_schema.*
</code>
===== drop all tables in a schema =====
<code sql>
SELECT 'drop table ' || schemaname || '.' || tablename || ';' FROM pg_tables WHERE schemaname IN ('public') ORDER BY schemaname, tablename;
</code>
===== drop all sequences in a schema =====
<code sql>
SELECT 'drop sequence public.' || sequence_name || ';' FROM information_schema.sequences WHERE sequence_schema IN ('public') ORDER BY sequence_name;
</code>
===== Grant on all tables =====
<code sql>
SELECT 'grant select, insert, update, delete on ' || tablename || ' to muh;' FROM pg_tables WHERE schemaname = 'public';
</code>

====== Backups ======

There are many options available for backing up your databases. Postgres itself already ships some tools:
* ''pg_dump'': per-database dumping of DB contents
* ''pg_basebackup'': binary backup of whole Postgres clusters (per-database backup is not possible)

However, especially for larger databases (hundreds of GB), more powerful third-party tools featuring parallelism and incremental backups are [[http://bonesmoses.org/2017/02/03/pg-phriday-getting-back-up|recommended]]:
* [[https://pgbackrest.org|pgBackRest]]
* [[https://www.pgbarman.org|pgBarman]]

===== pg_dump =====

It makes sense to use the postgres user:
<code bash>
sudo su postgres
</code>

There are two important variants of making backups:

Create and restore a backup using plain-text SQL: ''pg_dump'' and ''psql''
<code bash>
pg_dump -U user source_db -f db_backup.sql

# restore
createdb target_db
psql -U user -d target_db -f db_backup.sql
</code>

Create and restore a backup using the compressed 'custom' format: ''pg_dump'' and ''pg_restore''
<code bash>
pg_dump -Fc -U user source_db > db_backup.dmp

# restore (requires an empty database)
createdb target_db
pg_restore -Fc -d target_db db_backup.dmp
</code>

In addition we can use gzip (and zcat) to further compress the already compressed 'custom' format and give the dump a nice name:
<code bash>
pg_dump -Fc -U user source_db | gzip > "$(date +%Y-%m-%d)_db_backup.dmp.gz"
zcat YYYY-MM-DD_db_backup.dmp.gz | pg_restore -Fc -d target_db
</code>

Or, when creating the backup via cronjob in ''/etc/cron.d'':
<code>
0 5 * * * postgres pg_dump -Fc dbname | gzip > /path/to/db_backup_`date +\%Y-\%m-\%dT\%H_\%M`.dmp.gz
# ┬ ┬ ┬ ┬ ┬ ┬        ┬
# │ │ │ │ │ │        └ command
# │ │ │ │ │ └───────── user executing the command
# │ │ │ │ └─────────── day of week (0 - 7; 0 and 7 are Sunday, or use names)
# │ │ │ └───────────── month (1 - 12)
# │ │ └─────────────── day of month (1 - 31)
# │ └───────────────── hour (0 - 23)
# └─────────────────── min (0 - 59)
</code>

===== pg_basebackup =====

See e.g. [[https://blog.sleeplessbeastie.eu/2016/02/15/how-to-perform-base-postgresql-backup-using-pg_basebackup-utility|this tutorial]].

===== pgBackRest =====

A typical use of pgBackRest would be to install a backup server that backs up several DB servers. For now we only use a simpler approach: pgBackRest backing up on localhost. For more see the [[https://pgbackrest.org/user-guide.html|excellent user guide]] or [[https://pgstef.github.io/2018/01/04/introduction_to_pgbackrest.html|this good introductory blog post]].

Install from the Postgres repo:

<code bash>
sudo add-apt-repository "deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg main"
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo apt update
sudo apt install pgbackrest
</code>

==== Incremental Backups on Localhost ====

In addition to the global section, where e.g. retention is configured, you must add a so-called "stanza" for each cluster you want to back up. Make sure the repo1-path exists and is owned by the user postgres.

<file - /etc/pgbackrest/pgbackrest.conf>
[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=5
repo1-retention-diff=1

[STANZANAME]
pg1-path=/var/lib/postgresql/9.5/CLUSTERNAME
</file>

Then adjust the Postgres config to allow for proper backups:

<file - /etc/postgresql/9.5/main/postgresql.conf>
listen_addresses = '*'

# WRITE AHEAD LOG
wal_level = hot_standby
archive_mode = on
archive_command = 'pgbackrest --stanza=STANZANAME archive-push %p'

# REPLICATION
max_wal_senders = 3

# ERROR REPORTING AND LOGGING
log_line_prefix = ''
</file>

Restart the cluster (or the complete Postgres server) to activate the changes:

  sudo systemctl restart postgresql@9.5-CLUSTERNAME

Create and test the pgBackRest stanza:

  sudo -u postgres pgbackrest --stanza=STANZANAME --log-level-console=info stanza-create
  sudo -u postgres pgbackrest --stanza=STANZANAME --log-level-console=info check

Make backups manually (e.g. from a cronjob):

  sudo -u postgres pgbackrest --stanza=STANZANAME --type=full --log-level-console=detail backup
  sudo -u postgres pgbackrest --stanza=STANZANAME --type=incr --log-level-console=detail backup

Make **regular backups via cronjob**:

<file - /etc/cron.d/pgbackrest>
#minute hour day-of-month month day-of-week user command
0 4 * * 0 postgres pgbackrest --type=full --stanza=STANZANAME backup
0 4 * * 1-6 postgres pgbackrest --type=diff --stanza=STANZANAME backup
</file>

**Restore the latest backup.** The cluster must be stopped and the cluster directory must exist and be empty:

  sudo systemctl stop postgresql@9.5-CLUSTERNAME
  sudo mv /var/lib/postgresql/9.5/CLUSTERNAME /opt/backups/CLUSTERNAME_backup
  sudo -u postgres mkdir /var/lib/postgresql/9.5/CLUSTERNAME
  sudo chmod 0700 /var/lib/postgresql/9.5/CLUSTERNAME
  sudo -u postgres pgbackrest --stanza=STANZANAME --log-level-console=detail restore
  sudo systemctl start postgresql@9.5-CLUSTERNAME

To **restore a specific backup** use ''%%--set%%'' with the name of a folder in ''/var/lib/pgbackrest/backup/STANZANAME'', e.g.:

  sudo -u postgres pgbackrest --set=20180724-144511F --stanza=STANZANAME --log-level-console=detail restore

You can also copy the contents of ''/var/lib/pgbackrest'' to a different device and restore it there.

Log files can be found in ''/var/log/pgbackrest''.
====== Performance Tuning ======

[[Monitoring Server Activity]]
===== References =====
* See http://samiux.wordpress.com/2009/07/26/howto-performance-tuning-for-postgresql-on-ubuntudebian/ for a good introduction and some reasonable default values.
* See http://www.postgresql.org/docs/8.4/interactive/kernel-resources.html#SYSVIPC for kernel parameters that may be relevant when tuning Postgres performance on Linux systems.
* Note that while SHMMAX is measured in bytes, SHMALL is measured in **pages**.
* The default page size on a 32-bit Linux system is usually 4096 bytes (check with the command //getconf PAGE_SIZE//). So if you change SHMMAX you should also check that // SHMALL >= SHMMAX / PAGE_SIZE //.
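
The SHMALL/SHMMAX relation can be sketched as a small calculation (the 8 GB SHMMAX is a hypothetical example value; the page size is the common 4096 bytes):

<code bash>
# Worked example: compute the minimum SHMALL (in pages) for a given SHMMAX (in bytes)
SHMMAX=$((8 * 1024 * 1024 * 1024))   # hypothetical SHMMAX: 8 GB in bytes
PAGE_SIZE=4096                       # typical value; check with: getconf PAGE_SIZE
SHMALL=$((SHMMAX / PAGE_SIZE))       # SHMALL must be at least this many pages
echo "$SHMALL"
</code>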