For Ubuntu 12.04:
sudo apt-get install postgresql-9.1 pgadmin3
pgAdmin III is a strongly recommended GUI for administering your PostgreSQL server. However, here are the first steps on the command line right after installing.
Allow TCP/IP connections from all IPs (default = localhost only) by adding this line to /etc/postgresql/9.1/main/postgresql.conf:
listen_addresses = '*'
Then allow password authentication for all users from all IPs by adding this line to /etc/postgresql/9.1/main/pg_hba.conf:
host all all 0.0.0.0/0 md5
For more restrictive client authentication settings, have a look at pg_hba.conf.
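For example, a stricter rule could limit access to a single user, database, and subnet. A sketch (the database name, user name, and subnet are placeholders):

```
# TYPE  DATABASE  USER     ADDRESS          METHOD
host    mydb      appuser  192.168.1.0/24   md5
```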
First steps with a user of your choosing
# log in with user created by default (which has no password set)
sudo -u postgres psql

# create a user with your linux user name and log in
# (in Ubuntu the postgres user that matches your linux user name
# can by default log in without supplying a password)
sudo -u postgres createuser --superuser $USER
psql -d postgres

# set a password
\password

# make an existing user superuser (or remove superuser rights)
sudo -u postgres psql -c "ALTER USER theuser WITH SUPERUSER;"
sudo -u postgres psql -c "ALTER USER theuser WITH NOSUPERUSER;"

# create a restricted user with password (long and short version)
sudo -u postgres createuser --no-superuser --no-createdb --no-createrole --pwprompt username
sudo -u postgres createuser -SDRP username
psql -U username -d postgres
# drop a user
sudo -u postgres dropuser username
\password postgres
1. Create the user
sudo -u postgres createuser --no-superuser --no-createdb --no-createrole --pwprompt username
2. Grant readonly permissions
This is the tedious part (in postgres versions pre 9.0). We need to grant the user permission to connect to the database, to use the schema, and to SELECT from every single table/view:
GRANT CONNECT ON DATABASE mydb TO username;
GRANT USAGE ON SCHEMA public TO username;

-- Generate GRANT SELECT statements for all tables/views in the schema
SELECT 'GRANT SELECT ON public.' || relname || ' TO username;'
FROM pg_class
JOIN pg_namespace ON pg_namespace.oid = pg_class.relnamespace
WHERE nspname = 'public' AND relkind IN ('r', 'v');
Creating a true read-only user is a real pain in postgres. Just revoking privileges from a user still allows them to create database objects which they own. You need to revoke the CREATE and USAGE privileges from the SCHEMA public and the GROUP public.
-- Revoke default permissions from public group:
REVOKE CREATE ON SCHEMA public FROM PUBLIC;
REVOKE USAGE ON SCHEMA public FROM PUBLIC;

-- Add back permissions for the users allowed to create stuff
GRANT CREATE ON SCHEMA public TO rw_user;
GRANT USAGE ON SCHEMA public TO rw_user;

-- Add SELECT permissions for read-only user
GRANT USAGE ON SCHEMA public TO readonly_user;
GRANT SELECT ON my_table TO readonly_user;
See http://www.ruizs.org/archives/89 for details.
With a single GRANT command, permissions for all tables in a schema can be changed:
GRANT ALL ON ALL TABLES IN SCHEMA public TO user;
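Note that this only affects tables that already exist. As a sketch (assuming PostgreSQL 9.0+ and a hypothetical role readonly_user), ALTER DEFAULT PRIVILEGES can cover tables created in the future as well:

```sql
-- grant SELECT on all tables the current role will create in schema public
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO readonly_user;
```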
CLI: Unfortunately this can only be achieved via scripting, but we don't have to reinvent the wheel: https://gist.github.com/jirutka/afa3ce62b1430abf7572
GUI: Or in pgAdmin III right-click on the schema, go to Properties, and set the owner to your liking.
After you have logged into PostgreSQL it's good to know some often-used meta-commands (read more in man psql):
-- list all users
\du
-- list all schemas
\dn
-- list tables
\d
\d+           -- (also show comments)
\d tablename  -- show details of a table
-- list all tables / tables of a certain schema
\dt *.*           -- list all tables in DB
\dt schemaname.*  -- list all tables in a certain schema
-- list all views
\dv *.*  -- list all views in DB
-- list all databases on the host
\l
-- connect to a database
\c dbname
-- execute a command on the Linux shell (as-is)
\! command
-- quit
\q
For more meta-commands check out the documentation.
-- list active connections (and get procpids)
SELECT * FROM pg_stat_activity;
-- cancel a process but leave the user's session open
SELECT pg_cancel_backend(<procpid>);
-- terminate a user's query and session
SELECT pg_terminate_backend(<procpid>);
SELECT pg_database.datname,
       pg_size_pretty(pg_database_size(pg_database.datname)) AS size
FROM pg_database
ORDER BY pg_database_size(pg_database.datname) DESC;
SELECT d.datname AS name,
       pg_catalog.pg_get_userbyid(d.datdba) AS owner,
       CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')
            THEN pg_catalog.pg_size_pretty(pg_catalog.pg_database_size(d.datname))
            ELSE 'No Access'
       END AS size
FROM pg_catalog.pg_database d
ORDER BY CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')
              THEN pg_catalog.pg_database_size(d.datname)
              ELSE NULL
         END DESC -- nulls first
LIMIT 20;
SELECT nspname || '.' || relname AS "relation",
       pg_size_pretty(pg_relation_size(C.oid)) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_relation_size(C.oid) DESC
LIMIT 20;
-- size of data
SELECT pg_size_pretty(pg_total_relation_size('<table_name>'));

-- size of indices
SELECT c2.relname, c2.relpages * 8 / 1024 AS "MB"
FROM pg_class c, pg_class c2, pg_index i
WHERE c.relname = '<table_name>'
  AND c.oid = i.indrelid
  AND c2.oid = i.indexrelid
ORDER BY c2.relname;
SELECT DATE(TIMESTAMP) + (interval_id) * TIME '00:15:00' FROM ...
SELECT EXTRACT(MINUTE FROM TIMESTAMP '2008-08-01 12:14:13');
see http://www.postgresql.org/docs/8.3/static/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT
SELECT "time"(mytimestamp) FROM TABLE;
To convert a timestamp / interval to seconds (in the case of a timestamp: seconds since 1970-01-01):
SELECT EXTRACT (EPOCH FROM timestamp_or_interval );
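The same works for intervals, which gives an easy way to compute a duration in seconds; a small sketch (the literals are just examples):

```sql
-- seconds since the epoch for a timestamp
SELECT EXTRACT(EPOCH FROM TIMESTAMP '2013-05-18 12:30:00');
-- length of an interval in seconds: 2*3600 + 30*60
SELECT EXTRACT(EPOCH FROM INTERVAL '2 hours 30 minutes');  -- 9000
```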
If your data looks like this:
"2011/3/15 ";"6:17:3 "
You can construct timestamps using the "FM" (fill mode) modifier:
SELECT to_timestamp(utc_date || ' ' || utc_time, 'YYYY/FMMM/FMDD FMHH24:FMMI:FMSS');
UPDATE datetesting SET d = to_timestamp(to_char(d, 'YYYY-MM-DD 12:00:00'), 'YYYY-MM-DD HH24:MI:SS');
Boy, that was confusing, so here it goes…
SELECT '2013-05-18 12:30:00'::TIMESTAMP WITH TIME ZONE;
-- 2013-05-18 12:30:00+02
-- no timezone info provided: Postgres simply assumes the current TIME ZONE

SELECT '2013-05-18 12:30:00+00'::TIMESTAMP WITH TIME ZONE    AS with_tz,
       '2013-05-18 12:30:00+05'::TIMESTAMP WITHOUT TIME ZONE AS without_tz;
--         with_tz         |     without_tz
-- ------------------------+---------------------
--  2013-05-18 14:30:00+02 | 2013-05-18 12:30:00
SELECT '2013-05-18 12:30:00+06'::TIMESTAMP;
-- 2013-05-18 12:30:00 --> the +06 has been completely ignored

SELECT '2013-05-18 12:30:00+06'::TIMESTAMP WITH TIME ZONE;
-- 2013-05-18 08:30:00+02 --> 6:30 UTC --> 8:30 CET (+02)
SET TIME ZONE 'UTC';
SELECT '2013-05-18 12:30:00+00'::TIMESTAMP WITH TIME ZONE;
-- 2013-05-18 12:30:00+00

SET TIME ZONE 'CET';
SELECT '2013-05-18 12:30:00+00'::TIMESTAMP WITH TIME ZONE;
-- 2013-05-18 14:30:00+02
SELECT to_timestamp(1358749391); --> "2013-01-21 07:23:11+01"
SELECT to_timestamp(1358749391)::TIMESTAMP WITHOUT TIME ZONE; --> "2013-01-21 07:23:11"
SELECT to_timestamp(1358749391) AT TIME ZONE 'utc'; --> "2013-01-21 06:23:11"
CREATE AGGREGATE textcat_all(
  basetype = text,
  sfunc    = textcat,
  stype    = text,
  initcond = ''
);
Usage:
SELECT company_id, textcat_all(employee || ', ') FROM mytable GROUP BY company_id;
This is e.g. useful if you need an OID where none is present:
SELECT ROW_NUMBER() OVER (ORDER BY NULL) AS oid FROM mytable;
The ORDER BY clause is required, but in this special case we do not need any particular order, so we use NULL here.
CREATE OR REPLACE FUNCTION round_time_to_interval(TIMESTAMP)
RETURNS TIMESTAMP AS $$
  SELECT date_trunc('hour', $1)
         + INTERVAL '15 min' * FLOOR(date_part('minute', $1) / 15.0)
$$ LANGUAGE SQL;
This is an obscure one, but it is very handy if you must transform a number of rows into a number of columns. This can be achieved with the crosstab() function provided by the tablefunc module of postgresql-contrib. So first we need to install postgresql-contrib and enable the function in our db:
sudo apt-get install postgresql-contrib
sudo service postgresql-8.4 restart
cd /usr/share/postgresql/8.4/contrib
psql -U username -d dbname -h localhost -f tablefunc.sql
Next have a look at the documentation of crosstab here: http://www.postgresql.org/docs/8.4/static/tablefunc.html
Here is an example of how to aggregate a traffic state from rows to columns (for a number of known roads).
SELECT * FROM crosstab(
  'SELECT valid_from, road_id, q FROM ait.its_traffic_state
   WHERE road_id IN (23005242,23008577) ORDER BY 1',
  'SELECT DISTINCT road_id FROM ait.its_traffic_state
   WHERE road_id IN (23005242,23008577) ORDER BY 1'
) AS its_traffic_state("time" TIMESTAMP, road1 INTEGER, road2 INTEGER);
…for a higher number of columns you may need a script to generate the column_name/datatype row at the end of the query (if anyone knows a better solution, please tell).
When data is inserted into the database with a specified value for the autoincrementing column the sequence is not touched and must be updated manually. This is the case when e.g. importing dumps using COPY, or a simple INSERT statement with the autoincrementing column set to a value.
Because the existing values (e.g. id 1) were inserted explicitly, the sequence counter stays at its initial value (i.e. starts at 1 again): this prevents new data records from being created, as the new record would get a duplicate id (1 == 1).
The fix:
SELECT SETVAL('sequence_name', MAX(auto_incrementing_column)) FROM TABLE_NAME;
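If you don't know the sequence name, pg_get_serial_sequence() can look it up; a sketch assuming a hypothetical table mytable with a serial column id:

```sql
-- set the sequence behind mytable.id to the current maximum id
SELECT SETVAL(pg_get_serial_sequence('mytable', 'id'), COALESCE(MAX(id), 1)) FROM mytable;
```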
Assume you have an incomplete time series in your database but want to select a complete series (padding the missing values with zero). The following query takes care of the padding, replacing missing values with 0:
SELECT intervals.intervals AS interval, COALESCE(q, 0)
FROM timeseries t
RIGHT OUTER JOIN generate_series(0, 24) intervals
  ON t.interval = intervals.intervals;
To create a series of timestamps
SELECT * FROM generate_series('2013-11-01'::timestamp, '2013-11-02'::timestamp, '15 minutes');
This might be useful to replace WHERE x IN ('a','b','c') statements (which are notoriously slow in postgres) with JOINs.
SELECT i, (ARRAY['a','b','c'])[i]
FROM generate_series(1, array_upper(ARRAY['a','b','c'], 1)) i;

 i | array
---+-------
 1 | a
 2 | b
 3 | c
(3 rows)
It is not very elegant though especially as you need to specify the array twice (or know the size of the array, then you can replace the array_upper statement with the size). If someone has a better solution I would be interested :)
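One possible alternative, assuming PostgreSQL 9.4 or later, is unnest(...) WITH ORDINALITY, which avoids specifying the array twice:

```sql
-- each array element with its 1-based position
SELECT i, val
FROM unnest(ARRAY['a','b','c']) WITH ORDINALITY AS t(val, i);
```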
Download the most recent version from http://pgfoundry.org/frs/?group_id=1000038 . Most of the instructions in this sections are taken from the README file in the tarball.
Set the following parameters:

dynamic_library_path = '$libdir:/opt/pljava'
log_min_messages = info   # optional
custom_variable_classes = 'pljava'

Add the following parameter:

pljava.classpath = '/opt/pljava/pljava.jar'
* update ldconfig so postgres finds the required shared objects. Create a file /etc/ld.so.conf.d/pljava.conf containing
/usr/lib/jvm/java-6-sun/jre/lib/amd64
/usr/lib/jvm/java-6-sun/jre/lib/amd64/server
/usr/lib/jvm/java-6-sun/jre/lib/amd64/native_threads
sudo ldconfig
* install pl/java
sudo su postgres
psql -f install.sql
* Install the postgres-plr package
sudo aptitude install postgresql-8.3-plr
# or install the .deb manually:
wget http://de.archive.ubuntu.com/ubuntu/pool/universe/p/plr/postgresql-8.3-plr_8.3.0.6-3_amd64.deb
dpkg -i postgresql-8.3-plr_8.3.0.6-3_amd64.deb
* Install PL/R in the database of choice
psql <dbname> < /usr/share/postgresql/8.3/plr.sql
* Test
CREATE OR REPLACE FUNCTION r_max (INTEGER, INTEGER) RETURNS INTEGER AS '
  if (arg1 > arg2)
    return(arg1)
  else
    return(arg2)
' LANGUAGE 'plr' STRICT;

SELECT r_max(10, 20);
Have a look at the guide here http://www.joeconway.com/plr/doc/index.html
To efficiently insert a lot of records from CSV files into the table "tolldata" use the COPY command (http://www.postgresql.org/docs/9.1/static/sql-copy.html):
cat *.csv | psql -c "COPY tolldata FROM stdin WITH DELIMITER ';'" -d toll_tst -U toll -h localhost
Files named in a COPY command are read or written directly by the server, not by the client application. Therefore, they must reside on or be accessible to the database server machine, not the client. They must be accessible to and readable or writable by the PostgreSQL user (the user ID the server runs as), not the client. COPY naming a file is only allowed to database superusers, since it allows reading or writing any file that the server has privileges to access.
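If you only have client-side access, psql's \copy meta-command is the usual workaround: it reads/writes the file on the client and streams the data over the connection. A sketch, reusing the tolldata table from above (file names are placeholders):

```
-- inside psql: import a client-side file
\copy tolldata FROM 'data.csv' WITH DELIMITER ';'
-- export a query result to a client-side file
\copy (SELECT * FROM tolldata) TO 'export.csv' CSV HEADER
```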
If you need to export an entire table:
COPY table TO '/tmp/table.csv' DELIMITER ',' CSV HEADER;
If the query is more complicated:
--- toggle to unaligned output mode
\a
--- set ; as field delimiter
\f ;
--- disable the footer (row count)
\pset footer off
SELECT * FROM mytable \g file.csv
In psql command line:
SELECT * FROM mytable \g filename
(see http://archives.postgresql.org/pgsql-general/2001-12/msg00687.php for more details)
\dt
or
\dt my_schema
SELECT 'drop table ' || schemaname || '.' || tablename || ';'
FROM pg_tables
WHERE schemaname IN ('public')
ORDER BY schemaname, tablename;
SELECT 'drop sequence public.' || sequence_name || ';'
FROM information_schema.sequences
WHERE sequence_schema IN ('public')
ORDER BY sequence_name;
SELECT 'grant select, insert, update, delete on ' || tablename || ' to muh;'
FROM pg_tables
WHERE schemaname = 'public';
There are many options available on how to back up your databases. Postgres already contains some tools itself:
pg_dump: per-database dumping of db contents
pg_basebackup: binary backup of whole postgres clusters (per-db backup is not possible)

However, especially for larger databases (hundreds of GB), more powerful third-party tools featuring parallelism and incremental backups (such as pgBackRest, see below) are recommended.
It makes sense to do this as the postgres user:
sudo su postgres
There are two important variants of making backups:
Create and restore a backup using plain-text SQL: pg_dump and psql
pg_dump -U user source_db -f db_backup.sql

# restore
createdb target_db
psql -U user -d target_db -f db_backup.sql
Create and restore a backup using the compressed 'custom' format: pg_dump and pg_restore
pg_dump -Fc -U user source_db > db_backup.dmp

# restore (requires an empty database)
createdb target_db
pg_restore -Fc -d target_db db_backup.dmp
In addition we can use gzip (and zcat) to further compress the compressed 'custom' format and give the dump a nice name:
pg_dump -Fc -U user source_db | gzip > "$(date +%Y-%m-%d)_db_backup.dmp.gz"
zcat YYYY-MM-DD_db_backup.dmp.gz | pg_restore -Fc -d target_db
Or, when creating the backup via cronjob in /etc/cron.d:
0 5 * * * postgres pg_dump -Fc dbname | gzip > /path/to/db_backup_`date +\%Y-\%m-\%dT\%H_\%M`.dmp.gz
# ┬ ┬ ┬ ┬ ┬ ┬        ┬
# │ │ │ │ │ │        └ command
# │ │ │ │ │ └───── user executing the command
# │ │ │ │ └───────── day of week (0 - 7) (0 to 7 are Sunday to Sunday, or use names)
# │ │ │ └───────────── month (1 - 12)
# │ │ └───────────────── day of month (1 - 31)
# │ └───────────────────── hour (0 - 23)
# └───────────────────────── min (0 - 59)
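The backticks in the cron line expand via date (the % signs only need escaping inside a crontab). A quick sketch of the filename this produces, runnable on a plain shell:

```shell
# print the backup filename the cron line above would generate,
# e.g. db_backup_2018-07-24T14_45.dmp.gz
echo "db_backup_$(date +%Y-%m-%dT%H_%M).dmp.gz"
```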
see e.g. this tutorial
A typical use of pgBackRest would be to install a backup server that backs up several DB servers. For now we only use a simpler approach: pgBackRest backing up on localhost. For more see the excellent user guide or this good introduction blog post.
Install from postgres repo:
sudo add-apt-repository "deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg main"
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo apt update
sudo apt install pgbackrest
In addition to the global section, where e.g. retention is configured, you must add a so-called “stanza” for each cluster you want to back up. Make sure the repo1-path exists and is owned by user postgres.
[global]
repo1-path=/var/lib/pgbackrest
repo1-retention-full=5
repo1-retention-diff=1

[STANZANAME]
pg1-path=/var/lib/postgresql/9.5/CLUSTERNAME
Then adjust the postgres config to allow for proper backups:
listen_addresses = '*'

# WRITE AHEAD LOG
wal_level = hot_standby
archive_mode = on
archive_command = 'pgbackrest --stanza=STANZANAME archive-push %p'

# REPLICATION
max_wal_senders = 3

# ERROR REPORTING AND LOGGING
log_line_prefix = ''
Restart the cluster (or the complete postgres server) to activate the changes:
sudo systemctl restart postgresql@9.5-CLUSTERNAME
Create and test the pgbackrest stanza
sudo -u postgres pgbackrest --stanza=STANZANAME --log-level-console=info stanza-create
sudo -u postgres pgbackrest --stanza=STANZANAME --log-level-console=info check
Make backups manually (or from your own scripts):
sudo -u postgres pgbackrest --stanza=STANZANAME --type=full --log-level-console=detail backup
sudo -u postgres pgbackrest --stanza=STANZANAME --type=incr --log-level-console=detail backup
Make regular backups via cronjob
#minute hour day-of-month month day-of-week user     command
0       4    *            *     0           postgres pgbackrest --type=full --stanza=STANZANAME backup
0       4    *            *     1-6         postgres pgbackrest --type=diff --stanza=STANZANAME backup
Restore latest backup. The cluster must be stopped and the cluster directory must exist and be empty.
sudo systemctl stop postgresql@9.5-CLUSTERNAME
sudo mv /var/lib/postgresql/9.5/CLUSTERNAME /opt/backups/CLUSTERNAME_backup
sudo -u postgres mkdir /var/lib/postgresql/9.5/CLUSTERNAME
sudo chmod 0700 /var/lib/postgresql/9.5/CLUSTERNAME
sudo -u postgres pgbackrest --stanza=STANZANAME --log-level-console=detail restore
sudo systemctl start postgresql@9.5-CLUSTERNAME
To restore a specific backup use --set with the name of a folder in /var/lib/pgbackrest/backup/STANZANAME, e.g.:
sudo -u postgres pgbackrest --set=20180724-144511F --stanza=STANZANAME --log-level-console=detail restore
You can also copy the contents of /var/lib/pgbackrest to a different device and restore it there. Log files can be found in /var/log/pgbackrest.