Wednesday, June 2, 2021

Use Python to plot graphs from a CSV file - simple example

Exercise: Try plotting the data from a CSV file using Python and saving the plot to a JPEG or PNG file, which we can later make available to other apps.

csvfile name: test.csv

Location: Current working directory

Python version: 2.7.5

Modules used: matplotlib, numpy

Code:

import numpy as np
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
data = np.genfromtxt("test.csv", delimiter=",", names=["x", "y"])
plt.plot(data['x'], data['y'])
plt.savefig('/tmp/test1.png')

Code explanation:

numpy is used to read the CSV file in this example (via genfromtxt)
matplotlib is used for plotting the chart
matplotlib.use('Agg') selects a non-interactive backend, so a working DISPLAY variable is not required


csv content:

1,2
2,5
3,7
4,7
5,9
6,10
7,20
8,21
9,22
10,30

Output:




Thank you


Wednesday, April 14, 2021

Run Jobs in parallel using Linux Shell Script (multiple loaders)

Objective: 

We have job descriptions in the form of strings contained in a file called job.queue.
Each string corresponds to a job.
We have a lot of jobs in job.queue.
If we run them sequentially, it will take a lot of time.
We would like to parallelise them using a Linux shell script to reduce the overall run time.
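Before the full multi-loader script below, the core idea - run the queued jobs a few at a time instead of one by one - can be sketched with GNU xargs. This is only a minimal sketch (the echo/sleep pair stands in for whatever actually executes one job string); the script later in this post builds its own framework with per-loader queues, lock files and a tunable slot count instead.

#!/bin/bash
#Minimal sketch: run up to 5 entries from job.queue concurrently (assumes GNU xargs).
cd ~/ptest
xargs -P 5 -I '{}' sh -c 'echo "running {}"; sleep 1' < job.queue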

Lab setup:

OS: Centos 7.9.2009
Directory needed: ~/ptest
Files: job.queue, the actual script file, job.parallel

Code to create job.queue:

cd ~/ptest;
cat /dev/null > job.queue;
for i in {1..50000}
do
echo job${i} >> job.queue;
done
echo 10 >job.parallel;


Code to launch the jobs in parallel (save it as the script file under ~/ptest and run it from there):

#!/bin/bash
#code for parallel processing - model 2
#Note: bash is used because the script relies on export -f to export functions.
cd ~/ptest;
cat /dev/null > job.processed;
cat /dev/null > job.tracker;
cat /dev/null > job.lock;
ldcnt=5;
export ldcnt;
#Module to reset all operational files to the count of loaders.
for i in $(eval echo "{1..${ldcnt}}")
do
cat /dev/null > job.processed.th${i};
cat /dev/null > job.tracker.th${i};
cat /dev/null > job.lock.th${i};
rm job.parallel.th${i} >/dev/null 2>&1;
rm job.queue.th${i} >/dev/null 2>&1;
done
if ! [ -f job.parallel ];
then
    echo 1 > job.parallel;
fi
#fsq: record one processed job; an exclusive flock on job.lock.<thn> serialises the writers.
fsq()
{
exec 200>job.lock.${thn};
while true
do
    flock -xn 200 && break || sleep .01;
done
echo $key1,processed >> job.processed.${thn};
exec 200>&-;
}
#loader: feed jobs from this loader's queue, keeping at most job.parallel.<thn> concurrent fsq workers running.
loader()
{
typeset -i i;
i=0;
while read line
do
    pth=$(cat job.parallel.${thn});
    if [ $i -ge ${pth} ];
    then
        while true
        do
            if [ $i -lt ${pth} ];
            then
                break;
            else
                sleep .01;
                i=$(ps -ef|grep -i fsq|grep ${thn}|grep -v grep|wc -l);
            fi
        done
    fi
    ((i++));
    key1=$line;
    export -f fsq;
    export key1;
    { nohup bash -c fsq ${thn} & } >/dev/null 2>&1;
    jid=$(echo $!);
    echo $line - submitted under $jid|tee -a job.tracker.${thn};
done < job.queue.${thn}
}
#Module to launch the loaders
typeset -i j;
j=0;
#Module to break the work unit: split job.queue into ${ldcnt} chunk files (job.queue.th1, job.queue.th2, ...)
lcnt=$(cat job.queue|wc -l);
#Jobs per loader, rounded up: (lcnt + ldcnt - 1) / ldcnt
spltcnt=$(echo "scale=0; $(bc -l <<< ${lcnt}+${ldcnt}-1)/${ldcnt}"|bc);
awk -v s=${spltcnt} 'NR%s==1 { file = FILENAME  ".th" sprintf("%d", (NR/s)+1) } { print > file }' job.queue;
#parallel thread distributor: spread the total slot count from job.parallel across the per-loader job.parallel.th* files
pth=$(cat job.parallel);
typeset -i m;
m=0;
if [ ${pth} -ge ${ldcnt} ];
then
    for l in $(eval echo "{1..${pth}}")
    do
        ((m++));
        if [ -f job.parallel.th${m} ];
        then
            cupval=$(cat job.parallel.th${m});
            newval=$(bc -l <<< ${cupval}+1);
            echo ${newval} > job.parallel.th${m};
        else
            echo 1 > job.parallel.th${m};
        fi
        if [ ${m} -ge ${ldcnt} ];
        then
            m=0;
        fi
    done
else
    for l in $(eval echo "{1..${ldcnt}}")
    do
        ((m++));
        if [ ${pth} -gt 0 ];
        then
            echo 1 > job.parallel.th${m};
        else
            echo 0 > job.parallel.th${m};
        fi
        ((pth--));
    done
fi
#loader launcher
for k in $(ls -tr job.queue.th*)
do
    thn=$(echo $k|cut -d '.' -f 3);
    export thn;
    export -f loader;
    { nohup bash -c loader ${thn} & } >/dev/null 2>&1;
done
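For reference, the locking pattern used inside fsq - open a file descriptor on a lock file, spin on a non-blocking exclusive flock, write while holding the lock, then close the descriptor to release it - can be tried on its own. A minimal standalone sketch, assuming the same ~/ptest directory:

#!/bin/bash
cd ~/ptest
exec 200>demo.lock
#Spin until we win the non-blocking exclusive lock on fd 200.
while true
do
    flock -xn 200 && break || sleep .01
done
#Protected write: only one holder of the lock appends at a time.
echo "$(date) - protected write" >> demo.out
#Close the descriptor to release the lock.
exec 200>&-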

Throughput chart for the multiple-loader script above:

# of jobs/sec is shown below...


Thank you for your visit.

We will meet in the next blog.

Monday, March 1, 2021

Install & Setup Hadoop in Standalone Mode


Environment:
OS: Centos 7.9
Kernel: 3.10.0-1160.6.1.el7.x86_64
Java version:
openjdk version "1.8.0_282"
OpenJDK Runtime Environment (build 1.8.0_282-b08)
OpenJDK 64-Bit Server VM (build 25.282-b08, mixed mode)

1) Set up the hadoop user on the CentOS machine (follow the blog link below)

https://oracledbaplanner.blogspot.com/2021/02/adding-linux-usergroup-and-modifying.html

2) Set up a mount point for the Apache Hadoop install

[root@localhost ~]# mkdir /opt/hadoop

[root@localhost ~]# chown hadoop:bigdata /opt/hadoop

[root@localhost ~]# ls -altr /opt/hadoop
total 0
drwxr-xr-x. 4 root   root    53 Mar  1 02:14 ..
drwxr-xr-x. 2 hadoop bigdata  6 Mar  1 02:14 .
[root@localhost ~]#

3) Download the Hadoop binary from https://downloads.apache.org/hadoop/common/
Specifically, as below:


wget https://downloads.apache.org/hadoop/common/hadoop-3.3.0/hadoop-3.3.0.tar.gz

4) Once the download is finished, gunzip and untar the binary

[hadoop@localhost hadoop]$ ls -altr
total 489016
-rw-r--r--. 1 hadoop bigdata 500749234 Jul 15  2020 hadoop-3.3.0.tar.gz
drwxr-xr-x. 4 root   root           53 Mar  1 02:14 ..
drwxr-xr-x. 2 hadoop bigdata        33 Mar  1 02:32 .
[hadoop@localhost hadoop]$ gzip -d hadoop-3.3.0.tar.gz

[hadoop@localhost hadoop]$ ls -altr
total 1034752
-rw-r--r--. 1 hadoop bigdata 1059584000 Jul 15  2020 hadoop-3.3.0.tar
drwxr-xr-x. 4 root   root            53 Mar  1 02:14 ..
drwxr-xr-x. 2 hadoop bigdata         30 Mar  1 23:13 .

[hadoop@localhost hadoop]$ pwd
/opt/hadoop

[hadoop@localhost hadoop]$ ls -altr
total 1034752
-rw-r--r--. 1 hadoop bigdata 1059584000 Jul 15  2020 hadoop-3.3.0.tar
drwxr-xr-x. 4 root   root            53 Mar  1 02:14 ..
drwxr-xr-x. 2 hadoop bigdata         30 Mar  1 23:13 .
[hadoop@localhost hadoop]$ tar -tvf hadoop-3.3.0.tar|head
drwxr-xr-x brahma/brahma     0 2020-07-06 15:50 hadoop-3.3.0/
-rw-rw-r-- brahma/brahma   175 2020-03-24 13:23 hadoop-3.3.0/README.txt
..

I ran the untar in the background and evaluated nohup.out to check whether any errors were reported during the untar operation. The tar file is nearly 1 GB in size; after the untar, the total usage of the directory is about 2 GB.
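The untar command itself is not captured above; since nohup.out was checked afterwards, it was presumably run in the background along these lines (a sketch, not the exact command used):

cd /opt/hadoop
nohup tar -xvf hadoop-3.3.0.tar &
#once it finishes, inspect nohup.out for any tar errors
tail nohup.out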

[hadoop@localhost hadoop]$ ls -altr
total 1038160
drwxr-xr-x. 10 hadoop bigdata        215 Jul  6  2020 hadoop-3.3.0
-rw-r--r--.  1 hadoop bigdata 1059584000 Jul 15  2020 hadoop-3.3.0.tar
drwxr-xr-x.  4 root   root            53 Mar  1 02:14 ..
drwxr-xr-x.  3 hadoop bigdata         67 Mar  1 23:13 .
-rw-------.  1 hadoop bigdata    3486147 Mar  1 23:14 nohup.out

[hadoop@localhost hadoop]$ du -sh .
2.0G    .

[hadoop@localhost hadoop]$ du -sk .
2092064 .
[hadoop@localhost hadoop]$

5) Detect and set the Java home. Run the command below on the CentOS machine, look for java.home, and use that value to set JAVA_HOME in the hadoop user's .bash_profile file.

java -XshowSettings:properties -version

JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.282.b08-1.el7_9.x86_64/jre
export JAVA_HOME
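To pull the value out non-interactively, something like the line below works; note that -XshowSettings writes its output to stderr, hence the redirect:

java -XshowSettings:properties -version 2>&1 | grep 'java.home'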

6) Now launch a new session as the hadoop user and follow the steps below [https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html]

    a) Go to the untarred hadoop-3.3.0 directory

[hadoop@localhost hadoop-3.3.0]$ ls -altr
total 84
-rw-r--r--.  1 hadoop bigdata   175 Mar 24  2020 README.txt
-rw-r--r--.  1 hadoop bigdata  1541 Mar 24  2020 NOTICE.txt
-rw-r--r--.  1 hadoop bigdata 27570 Mar 24  2020 NOTICE-binary
-rw-r--r--.  1 hadoop bigdata 15697 Mar 24  2020 LICENSE.txt
-rw-r--r--.  1 hadoop bigdata 22976 Jul  4  2020 LICENSE-binary
drwxr-xr-x.  3 hadoop bigdata  4096 Jul  6  2020 sbin
drwxr-xr-x.  3 hadoop bigdata    20 Jul  6  2020 etc
drwxr-xr-x.  2 hadoop bigdata  4096 Jul  6  2020 licenses-binary
drwxr-xr-x.  3 hadoop bigdata    20 Jul  6  2020 lib
drwxr-xr-x. 10 hadoop bigdata   215 Jul  6  2020 .
drwxr-xr-x.  2 hadoop bigdata   203 Jul  6  2020 bin
drwxr-xr-x.  2 hadoop bigdata   106 Jul  6  2020 include
drwxr-xr-x.  4 hadoop bigdata   288 Jul  6  2020 libexec
drwxr-xr-x.  4 hadoop bigdata    31 Jul  6  2020 share
drwxr-xr-x.  3 hadoop bigdata    67 Mar  1 23:13 ..
[hadoop@localhost hadoop-3.3.0]$

[hadoop@localhost hadoop-3.3.0]$ pwd
/opt/hadoop/hadoop-3.3.0
[hadoop@localhost hadoop-3.3.0]$

    b) Create input directory and copy *.xml files from etc/hadoop directory

[hadoop@localhost hadoop-3.3.0]$ mkdir input

[hadoop@localhost hadoop-3.3.0]$ ls -ld input
drwxr-xr-x. 2 hadoop bigdata 6 Mar  2 01:13 input
[hadoop@localhost hadoop-3.3.0]$ cp etc/hadoop/*.xml input/
[hadoop@localhost hadoop-3.3.0]$

    c) Launch the command below, which uses the copied conf files as input, then finds and displays every match of the given regular expression; output is written to the given output directory.

bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.0.jar grep input output 'dfs[a-z.]+'

Output: 

2021-03-02 01:24:00,805 INFO impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties
2021-03-02 01:24:00,932 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
2021-03-02 01:24:00,932 INFO impl.MetricsSystemImpl: JobTracker metrics system started
2021-03-02 01:24:01,129 INFO input.FileInputFormat: Total input files to process : 10
2021-03-02 01:24:01,172 INFO mapreduce.JobSubmitter: number of splits:10
2021-03-02 01:24:01,460 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local440722701_0001
2021-03-02 01:24:01,460 INFO mapreduce.JobSubmitter: Executing with tokens: []
2021-03-02 01:24:01,680 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
2021-03-02 01:24:01,680 INFO mapreduce.Job: Running job: job_local440722701_0001
2021-03-02 01:24:01,686 INFO mapred.LocalJobRunner: OutputCommitter set in config null
...
2021-03-02 01:24:04,925 INFO mapreduce.Job:  map 100% reduce 100%
2021-03-02 01:24:04,926 INFO mapreduce.Job: Job job_local2096128567_0002 completed successfully
2021-03-02 01:24:04,931 INFO mapreduce.Job: Counters: 30
        File System Counters
                FILE: Number of bytes read=1203532
                FILE: Number of bytes written=3576646
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
        Map-Reduce Framework
                Map input records=1
                Map output records=1
                Map output bytes=17
                Map output materialized bytes=25
                Input split bytes=127
                Combine input records=0
                Combine output records=0
                Reduce input groups=1
                Reduce shuffle bytes=25
                Reduce input records=1
                Reduce output records=1
                Spilled Records=2
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=41
                Total committed heap usage (bytes)=273997824
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=123
        File Output Format Counters
                Bytes Written=23
[hadoop@localhost hadoop-3.3.0]$

    d) Verify the output

[hadoop@localhost hadoop-3.3.0]$ cat output/*
1       dfsadmin
[hadoop@localhost hadoop-3.3.0]$

[hadoop@localhost hadoop-3.3.0]$ ls -altr output/*
-rw-r--r--. 1 hadoop bigdata 11 Mar  2 01:24 output/part-r-00000
-rw-r--r--. 1 hadoop bigdata  0 Mar  2 01:24 output/_SUCCESS
[hadoop@localhost hadoop-3.3.0]$ cat output/part-r-00000
1       dfsadmin
[hadoop@localhost hadoop-3.3.0]$

# We are done with the Standalone mode of operation. We will see Pseudo-Distributed Operation in a separate blog.

Sunday, February 28, 2021

Adding a linux user/group and modifying user's primary group


Steps:

[root@localhost ~]# groupadd bigdata

[root@localhost ~]# grep -i bigdata /etc/group
bigdata:x:1001:
[root@localhost ~]#

[root@localhost ~]# useradd -m hadoop

[root@localhost ~]# id hadoop
uid=1001(hadoop) gid=1002(hadoop) groups=1002(hadoop)

[root@localhost ~]# ls -altr /home/hadoop
total 12
-rw-r--r--. 1 hadoop hadoop 231 Mar 31  2020 .bashrc
-rw-r--r--. 1 hadoop hadoop 193 Mar 31  2020 .bash_profile
-rw-r--r--. 1 hadoop hadoop  18 Mar 31  2020 .bash_logout
drwxr-xr-x. 4 root   root    35 Mar  1 02:00 ..
drwx------. 2 hadoop hadoop  62 Mar  1 02:00 .

[root@localhost ~]# usermod -g bigdata hadoop

[root@localhost ~]# ls -altr /home/hadoop
total 12
-rw-r--r--. 1 hadoop bigdata 231 Mar 31  2020 .bashrc
-rw-r--r--. 1 hadoop bigdata 193 Mar 31  2020 .bash_profile
-rw-r--r--. 1 hadoop bigdata  18 Mar 31  2020 .bash_logout
drwxr-xr-x. 4 root   root     35 Mar  1 02:00 ..
drwx------. 2 hadoop bigdata  62 Mar  1 02:00 .

[root@localhost ~]#

[root@localhost ~]# passwd hadoop
Changing password for user hadoop.
New password:
BAD PASSWORD: The password is shorter than 8 characters
Retype new password:
passwd: all authentication tokens updated successfully.
[root@localhost ~]#

<<<<<<<<<<< set a strong password

Now logged in as hadoop.

Last failed login: Mon Mar  1 02:06:41 EST 2021 from gateway on ssh:notty
Last login: Mon Mar  1 02:06:20 2021
[hadoop@localhost ~]$

[hadoop@localhost ~]$ ls -altr
total 16
-rw-r--r--. 1 hadoop bigdata 231 Mar 31  2020 .bashrc
-rw-r--r--. 1 hadoop bigdata 193 Mar 31  2020 .bash_profile
-rw-r--r--. 1 hadoop bigdata  18 Mar 31  2020 .bash_logout
drwxr-xr-x. 4 root   root     35 Mar  1 02:00 ..
-rw-------. 1 hadoop bigdata   5 Mar  1 02:06 .bash_history
drwx------. 2 hadoop bigdata  83 Mar  1 02:06 .
[hadoop@localhost ~]$

This closes this blog. Thanks.

Vagrant box - authentication issues


Box Os: Centos 7.9
Kernel: 3.10.0-1160.6.1.el7.x86_64

Symptoms:
1) Authentication failure after setting up the box with the insecure key, seen on the first reboot (vagrant halt/up/ssh). This is expected, since Vagrant will replace the insecure key on first boot.

    default: SSH auth method: private key
    default: Warning: Authentication failure. Retrying...
    default: Warning: Authentication failure. Retrying...

2) Even after generating a new key using the method below and copying the public key to the vagrant box, the authorized_keys file becomes zero bytes on restart (vagrant halt/up). This leads to authentication failure again.

$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/<useracnt>/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/<useracnt>/.ssh/id_rsa
Your public key has been saved in /home/<useracnt>/.ssh/id_rsa.pub
The key fingerprint is:
SHA256:7KVti+u55MszInXV0py5gkjyAucdhfWoE3R5NtgqUcs <useracnt>@<machineid>
The key's randomart image is:
+---[RSA 3072]----+
|      .+++       |
|     .oo++=      |
|      .oE+.= o   |
|  . o ++. o *    |
|   + =o+So.. .   |
|    o =oo+. .    |
|     o .+ o.     |
|    . .+o+ .     |
|     . oX*.      |
+----[SHA256]-----+

3) We then set config.ssh.insert_key = false in the Vagrantfile; this ensures authorized_keys isn't replaced. But we still get the encryption-related error below.

The private key you're attempting to use with this Vagrant box uses
an unsupported encryption type. The SSH library Vagrant uses does not support
this key type. Please use `ssh-rsa` or `ssh-dss` instead. Note that
sometimes keys in your ssh-agent can interfere with this as well,
so verify the keys are valid there in addition to standard
file paths.

4) We then generate an RSA key with the additional option below, which produced an RSA 4096 key instead of RSA 3072. We copy the public key using ssh-copy-id, but the encryption message is still thrown.

$ ssh-keygen -t rsa -b 4096
Generating public/private rsa key pair.
Enter file in which to save the key (/home/<useracnt>/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/<useracnt>/.ssh/id_rsa
Your public key has been saved in /home/<useracnt>/.ssh/id_rsa.pub
The key fingerprint is:
SHA256:hrSfT0367PoPlEMojHJZGJIbKdS9JadCDkdWiR2ayVs <useracnt>@<machineid>
The key's randomart image is:
+---[RSA 4096]----+
| ..o+B+=.        |
|  ooO=*=o  .     |
|   **oE*o . .    |
|    +*oo . . .   |
|    ..o S   =    |
|       o . = .   |
|        o o o    |
|         o o .   |
|          o+=..  |
+----[SHA256]-----+

Error:

The private key you're attempting to use with this Vagrant box uses
an unsupported encryption type. The SSH library Vagrant uses does not support
this key type. Please use `ssh-rsa` or `ssh-dss` instead. Note that
sometimes keys in your ssh-agent can interfere with this as well,
so verify the keys are valid there in addition to standard
file paths.

Now, the solution:


5) After some searching, we went through several solutions; none worked, and I didn't want to work around the issue. Finally I hit https://stackoverflow.com/questions/22922891/vagrant-ssh-authentication-failure and followed the steps in the solution offered by Sankari. It worked up to the point of resolving the encryption issue; one additional step, as detailed below, was needed for the fix.

    a. Generate a fresh ssh key pair on the client CentOS machine...

[vagrant@localhost ~]$ ssh-keygen -t rsa -b 4096
Generating public/private rsa key pair.
Enter file in which to save the key (/home/vagrant/.ssh/id_rsa):
/home/vagrant/.ssh/id_rsa already exists.
Overwrite (y/n)? y
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/vagrant/.ssh/id_rsa.
Your public key has been saved in /home/vagrant/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:9FVGrkwfmAn7XD5jMBJ+tkG6C3zWLv4BcX/vlSkuYzI vagrant@localhost.localdomain
The key's randomart image is:
+---[RSA 4096]----+
|          o ..+  |
|         . * B   |
|        . * & +  |
|       o . # @ . |
|        S * B B .|
|         + + . ++|
|          o o. oo|
|         E =.....|
|          =.+.  .|
+----[SHA256]-----+

    b. Copy the public key to authorized_keys with 0600 permission.
    c. Copy/paste the private key to the local cygwin64 .ssh/ directory.
    d. Try vagrant ssh now; it still attempts to use the private key below [this is where the stackoverflow solution's steps stop].

debug1: Will attempt key: <windowsdrive>:/VBoxVms/vagrant_home/vcentos79_hivetest/.vagrant/machines/default/virtualbox/private_key  explicit
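A quick way to confirm which identity file vagrant intends to use is vagrant ssh-config, which prints the SSH settings (HostName, Port, IdentityFile, and so on) that vagrant ssh will apply:

vagrant ssh-config | grep -i identityfile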

    e. Add the config setting config.ssh.private_key_path = ['~/.ssh/id_rsa'] to the Vagrantfile; this seems to have done the trick.

debug1: Trying private key: <windowsdrive>:/cygwin64/home/<useracnt>/.ssh/id_rsa

    f. Now the vagrant ssh works fine.

debug1: client_input_global_request: rtype hostkeys-00@openssh.com want_reply 0
Last login: Mon Mar  1 01:01:08 2021 from gateway
[vagrant@localhost ~]$

    g. Try a vagrant halt/up. Should be fine.

$ vagrant halt
==> default: Attempting graceful shutdown of VM...
<useracnt>@<machineid> ~/dba/vcentos79_hivetest

$ vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
==> default: Clearing any previously set forwarded ports...
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
==> default: Forwarding ports...
    default: 22 (guest) => 2222 (host) (adapter 1)
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2222
    default: SSH username: vagrant
    default: SSH auth method: private key
==> default: Machine booted and ready!
==> default: Checking for guest additions in VM...
==> default: Mounting shared folders...
    default: /vagrant => <windowsdrive>:/VBoxVms/vagrant_home/vcentos79_hivetest
==> default: Machine already provisioned. Run `vagrant provision` or use the `--provision`
==> default: flag to force provisioning. Provisioners marked to run always will still run.
<useracnt>@<machineid> ~/dba/vcentos79_hivetest


This closes this blog. Thank you for your visit.

Thursday, January 14, 2021

How to stop/start a postgres db

Lab: How to stop/start a postgres db

Run these as the postgres user (the recommendation is to never allow direct connections as postgres; instead, allow other users to su to the postgres user).


Start:

[postgres@localhost ~]$ pg_ctl -D /pgData/data -l logfile start
waiting for server to start.... done
server started

[postgres@localhost ~]$
where /pgData/data is the data directory in which the postgres database cluster is stored.

Examine the postgres db status:

[postgres@localhost ~]$ ps -ef|grep -i postgres
root      9284  9269  0 02:40 pts/0    00:00:00 su - postgres
postgres  9285  9284  0 02:40 pts/0    00:00:00 -bash
postgres  9319     1  0 02:42 ?        00:00:00 /pgBin/12.5/bin/postgres -D /pgData/data
postgres  9321  9319  0 02:42 ?        00:00:00 postgres: checkpointer
postgres  9322  9319  0 02:42 ?        00:00:00 postgres: background writer
postgres  9323  9319  0 02:42 ?        00:00:00 postgres: walwriter
postgres  9324  9319  0 02:42 ?        00:00:00 postgres: autovacuum launcher
postgres  9325  9319  0 02:42 ?        00:00:00 postgres: stats collector
postgres  9326  9319  0 02:42 ?        00:00:00 postgres: logical replication launcher
postgres  9327  9285  0 02:42 pts/0    00:00:00 ps -ef
postgres  9328  9285  0 02:42 pts/0    00:00:00 grep --color=auto -i postgres
[postgres@localhost ~]$

9319 is the parent process for the postgres process tree

[postgres@localhost ~]$ pg_ctl status -D /pgData/data
pg_ctl: server is running (PID: 9319)
/pgBin/12.5/bin/postgres "-D" "/pgData/data"
[postgres@localhost ~]$

Stop:

[postgres@localhost ~]$ pg_ctl -D /pgData/data stop
waiting for server to shut down.... done
server stopped
[postgres@localhost ~]$
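pg_ctl stop also accepts a shutdown mode via -m (smart, fast or immediate). For example, a fast shutdown aborts open transactions, disconnects clients and then shuts down cleanly:

pg_ctl -D /pgData/data stop -m fast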

Status:

[postgres@localhost ~]$ pg_ctl status -D /pgData/data
pg_ctl: no server running
[postgres@localhost ~]$


Steps to enable postgresql database to accept remote connections

Lab exercise: Try to connect to our test database with the test role/user through the pgAdmin 4.29 console, with postgres configured in a VirtualBox VM on a NAT network (port forwarded).

We need to do the following if we use VirtualBox with the NAT network alone.

OS details: Centos 7.9

Steps:

1) The NAT network allows internet access but does not allow incoming connections, hence we set up port forwarding for this to work. Add a port-forwarding rule in the VirtualBox console for the VM we are testing:

127.0.0.1:54320 => VM's 5432 (example: the postgres port)
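The same forwarding rule can also be added from the command line with VBoxManage while the VM is powered off (a sketch; the VM name is a placeholder):

VBoxManage modifyvm "<vm-name>" --natpf1 "pgsql,tcp,127.0.0.1,54320,,5432"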

2) Back up and edit the postgresql.conf file, then set listen_addresses = '*' and confirm the port.
3) Back up and edit the pg_hba.conf file, then add the entry below:

host    all    all    10.0.0.0/16    md5
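A sketch of the edits for steps 2 and 3, assuming the data directory layout used in the earlier postgres posts on this blog (/pgData/data); back up both files first:

cp /pgData/data/postgresql.conf /pgData/data/postgresql.conf.bak
cp /pgData/data/pg_hba.conf /pgData/data/pg_hba.conf.bak
#in postgresql.conf set: listen_addresses = '*'  (and confirm the port setting)
echo "host    all    all    10.0.0.0/16    md5" >> /pgData/data/pg_hba.conf
#listen_addresses requires a restart to take effect
pg_ctl -D /pgData/data restart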

4) Stop the firewall or add an exception
a) Stop the firewall - pgAdmin connects perfectly fine.
b) Add an exception via:
i) the service [see below] - this didn't work as expected; the pgAdmin console was always timing out. The reason could be that netstat -plantu / ss -plantu didn't show the postgresql service against the port postgres is running on; instead it showed postgres.

But firewall-cmd --get-services was showing postgresql as an available service.
[root@localhost ~]# firewall-cmd --add-service=postgresql --permanent
success
[root@localhost ~]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: enp0s3
  sources:
  services: dhcpv6-client ssh
  ports:
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:
[root@localhost ~]# firewall-cmd --reload
success
[root@localhost ~]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: enp0s3
  sources:
  services: dhcpv6-client postgresql ssh
  ports:
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:
[root@localhost ~]#
[root@localhost ~]# firewall-cmd --get-services|tr " " "\n"
..
plex
pmcd
pmproxy
pmwebapi
pmwebapis
pop3
pop3s
postgresql
privoxy
..
zabbix-server
[root@localhost ~]#

ii) Try adding the port to the exceptions - pgAdmin connects fine.

[root@localhost ~]# firewall-cmd --add-port=<port>/tcp --permanent
success
[root@localhost ~]# firewall-cmd --reload
success
[root@localhost ~]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: enp0s3
  sources:
  services: dhcpv6-client ssh
  ports: <port>/tcp
  protocols:
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:
[root@localhost ~]#

Thanks for your visit to the blog.

Saturday, January 9, 2021

Postgres Custom Installation to a directory we choose instead of going with the default install location [/usr/local/pgsql]


Environment: 

Centos 7.9 - x86_64

Vagrant box: 

Custom build box

Postgres version: 

12.5 (the latest in release 12 at the time of this documentation)

Install doc: 

https://www.postgresql.org/docs/12/install-procedure.html

List of options we decided to customize:

1) Path of the postgres binary [--prefix=PREFIX]

2) Other options to include in configure [--with-pgport=1512 for version 12; --with-perl, --with-python for pl/perl & pl/python support]

Options understood but deferred for later testing (see the hypothetical example after this list):

1) --with-segsize=SEGSIZE
2) --with-blocksize=BLOCKSIZE
3) --with-wal-blocksize=BLOCKSIZE
4) --enable-dtrace

Leaving them at their default values.
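For reference only, if those options were to be customized they would be passed to configure alongside the chosen flags, along these lines (hypothetical values, not used in this install):

./configure --prefix=/pgBin/12.5 --with-pgport=1512 --with-perl --with-python \
            --with-segsize=2 --with-blocksize=16 --with-wal-blocksize=16 --enable-dtrace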

Postgres recommended options not considered for now:

This install is mainly focused on administration-side learning rather than development.

1) --enable-debug
2) --enable-cassert

Prerequisites:

https://www.postgresql.org/docs/12/install-requirements.html

make version 3.80 or newer
GCC
tar, gzip, bzip2
Perl 5.8.3
Python 2.4 min.
openssl

Not an exhaustive list, but these are the ones I am interested in.

Verification of the precheck:

[root@localhost ~]# make --version
GNU Make 3.82
Built for x86_64-redhat-linux-gnu
Copyright (C) 2010  Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

[root@localhost ~]# which openssl
/usr/bin/openssl

[root@localhost ~]# perl --version

This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux-thread-multi
(with 41 registered patches, see perl -V for more detail)

[root@localhost ~]# python --version
Python 2.7.5
[root@localhost ~]#

So I have all the packages I need.

Directory structure and memory footprint:


[root@localhost ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/sdb1                2.1G   33M  2.1G   2% /tmp
/dev/mapper/dbvg-db00    2.0G   33M  2.0G   2% /pgBin
/dev/mapper/dbvg-db01    2.0G   33M  2.0G   2% /pgData
/dev/mapper/dbvg-db03    2.0G   33M  2.0G   2% /pgBackup
/dev/mapper/dbvg-db02    2.0G   33M  2.0G   2% /pgWal

[root@localhost ~]# free -k
              total        used        free      shared  buff/cache   available
Mem:        1882080      201560     1476888        8792      203632     1528440
Swap:       3774456           0     3774456
[root@localhost ~]#

Steps to install the postgres binary:

Step 1) Source download

Download the source from https://www.postgresql.org/docs/12/install-getsource.html => https://www.postgresql.org/ftp/source/v12.5/

Step 2) Decompress and untar

[root@localhost pgbin]# ls -altr
total 133628
drwxrwxrwx. 1 vagrant vagrant      4096 Nov  9 17:39 postgresql-12.5
-rwxrwxrwx. 1 vagrant vagrant 136826880 Jan  9 03:46 postgresql-12.5.tar
drwxrwxrwx. 1 vagrant vagrant      4096 Jan  9 03:50 ..
drwxrwxrwx. 1 vagrant vagrant         0 Jan  9 03:51 .
[root@localhost pgbin]# find . -name 'configure'
./postgresql-12.5/configure
[root@localhost pgbin]#

Step 3) Create necessary directory

Create the directory 12.5 under the /pgBin mount point.

Step 4) Launch configure command with necessary flags

cd /pgBin/src12.5/postgresql-12.5/
./configure --prefix=/pgBin/12.5 --with-pgport=1512 --with-perl --with-python

You may need to install the packages listed below, depending on the errors configure reports.

Extra packages installed:
yum install perl-ExtUtils-MakeMaker perl-ExtUtils-Embed
yum install readline-devel
yum install zlib-devel
yum install python-devel

You can examine config.log or the on-screen messages; once configure finishes successfully, move on to the next step.

We have the below list of files freshly created:

-rw-r--r--.  1 root root  83906 Jan  9 04:19 config.log_readline
-rw-r--r--.  1 root root  71852 Jan  9 06:53 config.log_zlibdevel
-rwxr-xr-x.  1 root root  40367 Jan  9 07:01 config.status
-rw-r--r--.  1 root root   3998 Jan  9 07:01 GNUmakefile
drwxrwxrwx. 16 1107 1107   4096 Jan  9 07:01 src
-rw-r--r--.  1 root root 442216 Jan  9 07:01 config.log
drwxrwxrwx.  6 1107 1107   4096 Jan  9 07:01 .
[root@localhost postgresql-12.5]#

Complete list:

[root@localhost postgresql-12.5]# find . -mtime -1 -ls
2117696    4 drwxrwxrwx   6 1107     1107         4096 Jan  9 07:01 .
1316371    4 drwxrwxrwx  16 1107     1107         4096 Jan  9 07:01 ./src
3146075    4 drwxrwxrwx  31 1107     1107         4096 Jan  9 07:01 ./src/include
3148247   36 -rw-r--r--   1 root     root        32788 Jan  9 07:01 ./src/include/pg_config.h
3148248    4 -rw-r--r--   1 root     root            1 Jan  9 07:01 ./src/include/stamp-h
3148249    4 -rw-r--r--   1 root     root          323 Jan  9 07:01 ./src/include/pg_config_ext.h
3148250    4 -rw-r--r--   1 root     root            1 Jan  9 07:01 ./src/include/stamp-ext-h
3148252    0 lrwxrwxrwx   1 root     root           30 Jan  9 07:01 ./src/include/pg_config_os.h -> ../../src/include/port/linux.h
1317040    4 drwxrwxrwx   2 1107     1107         4096 Jan  9 07:01 ./src/interfaces/ecpg/include
3148251    4 -rw-r--r--   1 root     root          572 Jan  9 07:01 ./src/interfaces/ecpg/include/ecpg_config.h
1319139    4 -rw-r--r--   1 root     root            1 Jan  9 07:01 ./src/interfaces/ecpg/include/stamp-h
2119058    0 drwxrwxrwx   6 1107     1107          252 Jan  9 07:01 ./src/backend/port
2119591    0 lrwxrwxrwx   1 root     root           37 Jan  9 07:01 ./src/backend/port/tas.s -> ../../../src/backend/port/tas/dummy.s
2119592    0 lrwxrwxrwx   1 root     root           38 Jan  9 07:01 ./src/backend/port/pg_sema.c -> ../../../src/backend/port/posix_sema.c
2119593    0 lrwxrwxrwx   1 root     root           38 Jan  9 07:01 ./src/backend/port/pg_shmem.c -> ../../../src/backend/port/sysv_shmem.c
3148246   36 -rw-r--r--   1 root     root        34059 Jan  9 07:01 ./src/Makefile.global
1319140    0 lrwxrwxrwx   1 root     root           31 Jan  9 07:01 ./src/Makefile.port -> ../src/makefiles/Makefile.linux
2119586   84 -rw-r--r--   1 root     root        83906 Jan  9 04:19 ./config.log_readline
2119587   72 -rw-r--r--   1 root     root        71852 Jan  9 06:53 ./config.log_zlibdevel
2119588  432 -rw-r--r--   1 root     root       442216 Jan  9 07:01 ./config.log
2119590   40 -rwxr-xr-x   1 root     root        40367 Jan  9 07:01 ./config.status
3148245    4 -rw-r--r--   1 root     root         3998 Jan  9 07:01 ./GNUmakefile
[root@localhost postgresql-12.5]#

Examining a file to see what it has...

./src/Makefile.global

Snipped content: this shows that we chose /pgBin/12.5 as the base path for the binary install, and that configure created a makefile which will later be used in the make process.
==========================
prefix := /pgBin/12.5
exec_prefix := ${prefix}
datarootdir := ${prefix}/share

bindir := ${exec_prefix}/bin
==========================

Step 5) Launch make command

make world

Other options are make or make all. Since make world gives better coverage, I chose make world over make all or plain make.

Output: 
PostgreSQL, contrib, and documentation successfully made. Ready to install.

[root@localhost ~]# ls -altr /pgBin/12.5/
total 0
drwxr-xr-x. 2 root root  6 Jan  9 03:55 .
drwxr-xr-x. 4 root root 33 Jan  9 04:08 ..
[root@localhost ~]#

<<< so this directory is still empty

Step 6) make check

This step has to be done as a non-root user, so I chose vagrant, the only other user I have.

=======================
 All 194 tests passed.
=======================

This step created the below directory under the source tree:

drwxrwxr-x.  4 vagrant vagrant     30 Jan  9 07:43 tmp_install

Step 7) make install-world

make install-world

Output:
...
PostgreSQL, contrib, and documentation installation complete.

<<<<<<<<<<<<<<<<< This completes installation.

Post Install

Step 1) Environment Variable config:

The optional step below registers the new library directory so that the run-time linker can find the shared libraries faster:
Optional: /sbin/ldconfig /pgBin/12.5/lib

Environment variables to go into ~/.bash_profile:

LD_LIBRARY_PATH=/pgBin/12.5/lib
export LD_LIBRARY_PATH
PATH=/pgBin/12.5/bin:$PATH
export PATH
MANPATH=/pgBin/12.5/share/man:$MANPATH
export MANPATH
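After sourcing the profile, a quick sanity check (a sketch) confirms the new binaries under /pgBin/12.5 are the ones picked up:

source ~/.bash_profile
which pg_ctl initdb psql
psql --version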

Step 2) Adding postgres user

[root@localhost pgData]# adduser postgres
[root@localhost pgData]# id postgres
uid=1001(postgres) gid=1001(postgres) groups=1001(postgres)
[root@localhost pgData]#

Step 3) changing ownership of the data directory

mkdir /pgData/data
chown postgres:postgres /pgData/data

Step 4) Initialize the first postgres database

su - postgres

.bash_profile is edited with the variables we formed in Step 1 of the post-install section.

initdb -D /pgData/data
pg_ctl -D /pgData/data -l logfile start
createdb test
psql test

Output below:

[postgres@localhost ~]$ initdb -D /pgData/data
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /pgData/data ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... America/New_York
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    pg_ctl -D /pgData/data -l logfile start

[postgres@localhost ~]$ pg_ctl -D /pgData/data -l logfile start
waiting for server to start.... done
server started
[postgres@localhost ~]$ createdb test
[postgres@localhost ~]$ psql test
psql (12.5)
Type "help" for help.

test=# \c
You are now connected to database "test" as user "postgres".
test=# \l
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 test      | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |

test=#
