
Apache Zeppelin is a web-based, open source notebook and collaboration tool for interactive data ingestion, discovery, analytics, and visualization. Zeppelin ships interpreters for more than 20 languages and data-processing backends, including Apache Spark, SQL, R, Elasticsearch, and more. It lets you create data-driven documents and see the results of your analytics alongside the code that produced them.
Prerequisites
- A DreamVPS Ubuntu 16.04 server instance.
- A sudo user.
- A domain name pointed towards the server.
For this tutorial, we will use ‘zeppelin.example.com’ as the domain name pointed towards the server. Please make sure to replace all occurrences of the example domain name with your actual one.
Install Java
Apache Zeppelin is written in Java, so it requires a JDK to run. Add the Ubuntu repository for Oracle Java 8.
sudo add-apt-repository --yes ppa:webupd8team/java
sudo apt update
Install Oracle Java.
sudo apt -y install oracle-java8-installer
Verify its version.
java -version
You should see the following output.
user@dreamvps:~$ java -version
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
Set the default Java path by installing the following package.
sudo apt -y install oracle-java8-set-default
You can verify that ‘JAVA_HOME’ is set by running the command below.
echo $JAVA_HOME
You should see the following.
user@dreamvps:~$ echo $JAVA_HOME
/usr/lib/jvm/java-8-oracle
If you see no output at all, log out of the current shell and log back in.
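If you would rather check the Java version from a script than read the output by eye, the following is a minimal sketch. The helper names `java_major` and `installed_java_version` are hypothetical and not part of any package; the sketch assumes the version string appears in double quotes in the `java -version` output, as in the sample above.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Print the major Java version for a version string.
# Handles the legacy scheme (1.8.0_161 -> 8) and the newer one (11.0.2 -> 11).
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 ;;
    esac
}

# Print the version string reported by the installed JVM.
# Note that "java -version" writes to stderr, hence the 2>&1.
installed_java_version() {
    java -version 2>&1 | awk -F '"' '/version/ {print $2; exit}'
}

# Example usage (commented out so that sourcing this file has no side effects):
#   [ "$(java_major "$(installed_java_version)")" -eq 8 ] && echo "Java 8 detected"
```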
Install Zeppelin
Apache Zeppelin ships all of its dependencies along with the binary files, so you do not need to install anything other than Java. Download the Zeppelin binary to your system; you can find the latest version of the application on the Zeppelin download page.
wget http://www-us.apache.org/dist/zeppelin/zeppelin-0.7.3/zeppelin-0.7.3-bin-all.tgz
Extract the archive.
sudo tar xf zeppelin-*-bin-all.tgz -C /opt
The above command will extract the archive to ‘/opt/zeppelin-0.7.3-bin-all’. Rename the directory for the sake of convenience.
sudo mv /opt/zeppelin-*-bin-all /opt/zeppelin
Apache Zeppelin is now installed. You can start the application immediately, but it will not be accessible to you because it listens on localhost only. In the following steps, we will configure Apache Zeppelin as a service and set up Nginx as a reverse proxy in front of it.
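Before moving on, it can help to confirm that the extraction and rename left the files where the rest of this tutorial expects them. The sketch below checks only for the three files this guide refers to later; the function name `zeppelin_layout_ok` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper for illustration only.
# Return 0 if DIR contains the files this tutorial relies on:
# the daemon script and the two configuration templates.
zeppelin_layout_ok() {
    dir=$1
    [ -x "$dir/bin/zeppelin-daemon.sh" ] &&
    [ -f "$dir/conf/zeppelin-site.xml.template" ] &&
    [ -f "$dir/conf/shiro.ini.template" ]
}

# Example usage (commented out so that sourcing this file has no side effects):
#   zeppelin_layout_ok /opt/zeppelin && echo "Zeppelin layout looks complete"
```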
Configure Systemd
In this step, we will set up a Systemd unit file for the Zeppelin application. This will ensure that the application process is automatically started on system restarts as well as failures.
For security reasons, create an unprivileged user for running the Zeppelin process.
sudo useradd -d /opt/zeppelin -s /bin/false zeppelin
Give ownership of the files to the newly created Zeppelin user.
sudo chown -R zeppelin:zeppelin /opt/zeppelin
Create a new Systemd service unit file.
sudo nano /etc/systemd/system/zeppelin.service
Fill the file with the following information.
[Unit]
Description=Zeppelin service
After=syslog.target network.target

[Service]
Type=forking
ExecStart=/opt/zeppelin/bin/zeppelin-daemon.sh start
ExecStop=/opt/zeppelin/bin/zeppelin-daemon.sh stop
ExecReload=/opt/zeppelin/bin/zeppelin-daemon.sh reload
User=zeppelin
Group=zeppelin
Restart=always

[Install]
WantedBy=multi-user.target
Start the application.
sudo systemctl start zeppelin
Enable the Zeppelin service to automatically start at boot time.
sudo systemctl enable zeppelin
To ensure that the service is running, you can run the following.
sudo systemctl status zeppelin
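Zeppelin can take a little while to answer requests after the service reports as started. The sketch below polls the local web port until it responds. The `retry` and `zeppelin_up` helpers are hypothetical names introduced here; the sketch assumes `curl` is installed and that Zeppelin is on its default port, 8080.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Run a command up to ATTEMPTS times, sleeping DELAY seconds between tries.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
        [ "$i" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}

# Succeeds once the local Zeppelin web server answers an HTTP request.
zeppelin_up() {
    curl -fsS -o /dev/null http://127.0.0.1:8080/
}

# Example usage (commented out so that sourcing this file has no side effects):
#   retry 10 3 zeppelin_up && echo "Zeppelin is answering on port 8080"
```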
Configure Reverse Proxy
By default, the Zeppelin server listens on port 8080 of localhost. In this example, we will use Nginx as a reverse proxy so that the application can be accessed via the standard HTTP and HTTPS ports. We will also configure Nginx to use an SSL certificate issued by Let’s Encrypt, a free certificate authority (CA).
Install Nginx.
sudo apt -y install nginx
Start Nginx and enable it to automatically start at boot time.
sudo systemctl start nginx
sudo systemctl enable nginx
Add the Certbot repository.
sudo add-apt-repository --yes ppa:certbot/certbot
sudo apt-get update
Install Certbot, which is the client application for Let’s Encrypt CA.
sudo apt -y install certbot
Note: To obtain certificates from Let’s Encrypt CA, the domain for which the certificates are to be generated must be pointed towards the server. If not, make the necessary changes to the DNS records of the domain and wait for the DNS to propagate before making the certificate request again. Certbot checks the domain authority before providing the certificates.
Generate the SSL certificates.
sudo certbot certonly --webroot -w /var/www/html -d zeppelin.example.com
The generated certificates are likely to be stored in ‘/etc/letsencrypt/live/zeppelin.example.com/’. The SSL certificate will be stored as ‘fullchain.pem’ and the private key will be stored as ‘privkey.pem’.
Let’s Encrypt certificates expire in 90 days, hence it is recommended to set up auto-renewal of the certificates using Cron jobs.
Open the cron job file.
sudo crontab -e
Add the following line at the end of the file.
30 5 * * * /usr/bin/certbot renew --quiet
The above cron job will run every day at 5:30 AM. If the certificate is due for expiration, it will automatically be renewed.
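To confirm that renewal is keeping up, you can check how much validity a certificate has left. The sketch below wraps the `-checkend` option of `openssl x509`; the function name `cert_valid_for_days` is hypothetical. It assumes `openssl` is installed and that you run it from a shell that can read the certificate, since the Let’s Encrypt directory is often readable by root only.

```shell
#!/bin/sh
# Hypothetical helper for illustration only.
# Return 0 if the certificate in FILE is still valid DAYS days from now,
# non-zero if it will have expired by then.
cert_valid_for_days() {
    file=$1
    days=$2
    openssl x509 -checkend "$((days * 86400))" -noout -in "$file" >/dev/null
}

# Example usage (commented out so that sourcing this file has no side effects):
#   cert_valid_for_days /etc/letsencrypt/live/zeppelin.example.com/fullchain.pem 30 \
#       || echo "Certificate expires within 30 days"
```

You can also exercise the renewal path itself without replacing any certificate by running `sudo certbot renew --dry-run`.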
Create a new server block file for the Zeppelin site.
sudo nano /etc/nginx/sites-available/zeppelin
Fill the file with the following information.
upstream zeppelin {
    server 127.0.0.1:8080;
}
server {
    listen 80;
    server_name zeppelin.example.com;
    return 301 https://$host$request_uri;
}
server {
    listen 443 ssl;
    server_name zeppelin.example.com;
    ssl_certificate /etc/letsencrypt/live/zeppelin.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/zeppelin.example.com/privkey.pem;
    ssl_session_cache builtin:1000 shared:SSL:10m;
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_ciphers HIGH:!aNULL:!eNULL:!EXPORT:!CAMELLIA:!DES:!MD5:!PSK:!RC4;
    ssl_prefer_server_ciphers on;
    access_log /var/log/nginx/zeppelin.access.log;
    location / {
        proxy_pass http://zeppelin;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_set_header X-NginX-Proxy true;
        proxy_redirect off;
    }
    location /ws {
        proxy_pass http://zeppelin/ws;
        proxy_http_version 1.1;
        proxy_set_header Upgrade websocket;
        proxy_set_header Connection upgrade;
        proxy_read_timeout 86400;
    }
}
Activate the configuration file.
sudo ln -s /etc/nginx/sites-available/zeppelin /etc/nginx/sites-enabled/zeppelin
Restart Nginx and Zeppelin so that the changes can take effect.
sudo systemctl restart nginx zeppelin
Zeppelin is now accessible on the following address.
‘https://zeppelin.example.com’
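A quick way to check the proxy from the command line is to compare HTTP status codes. With the server blocks above, a plain HTTP request should be answered with a 301 redirect, and the HTTPS request would typically return 200 once Zeppelin is up. The helper names below are hypothetical, and the sketch assumes `curl` is installed.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Print the HTTP status code returned for URL, without following redirects.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Return 0 if URL answers with the expected status code.
expect_status() {
    [ "$(http_status "$1")" = "$2" ]
}

# Example usage (commented out so that sourcing this file has no side effects):
#   expect_status http://zeppelin.example.com/ 301 && echo "HTTP redirects to HTTPS"
#   expect_status https://zeppelin.example.com/ 200 && echo "HTTPS proxy is working"
```

If Nginx refuses to restart, `sudo nginx -t` reports syntax errors in the configuration files.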
By default, there is no authentication enabled, so you can use the application directly.
Since the application is accessible to everyone, the notebooks you create are also accessible to everyone. It is very important to disable anonymous access and enable authentication so that only the authenticated users can access the application.
Disable Anonymous Access
To disable the default anonymous access, copy the configuration file template to its live location.
cd /opt/zeppelin
sudo cp conf/zeppelin-site.xml.template conf/zeppelin-site.xml
Edit the configuration file.
sudo nano conf/zeppelin-site.xml
Find the following lines in the file.
<property>
<name>zeppelin.anonymous.allowed</name>
<value>true</value>
Change the value to ‘false’ in order to disable the anonymous access.
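If you prefer to make this change without opening an editor, the following is a minimal sketch using `sed`. The function name `disable_anonymous` is hypothetical, and the sketch assumes that the `<value>` line directly follows the `<name>` line, as in the excerpt above. Other properties are left untouched.

```shell
#!/bin/sh
# Hypothetical helper for illustration only.
# Set zeppelin.anonymous.allowed to false in FILE.
# Assumes the <value> line immediately follows the matching <name> line.
disable_anonymous() {
    file=$1
    tmp=$(mktemp) || return 1
    if sed '/<name>zeppelin\.anonymous\.allowed<\/name>/{
n
s/<value>true<\/value>/<value>false<\/value>/
}' "$file" > "$tmp"; then
        # Write back through the existing file to keep its owner and mode.
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Example usage (commented out so that sourcing this file has no side effects):
#   disable_anonymous /opt/zeppelin/conf/zeppelin-site.xml
```

Because the file was copied with sudo, paste the function into a shell that has write access to it, for example a root shell.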
Enable Shiro Authentication
Now that you have disabled the anonymous access, you need to enable some kind of authentication mechanism so that privileged users can log in. Apache Zeppelin uses Apache Shiro authentication. Copy the ‘Shiro’ configuration file.
sudo cp conf/shiro.ini.template conf/shiro.ini
Edit the configuration file.
sudo nano conf/shiro.ini
Find the following lines in the file.
[users]
admin = password1, admin
user1 = password2, role1, role2
user2 = password3, role3
user3 = password4, role2
Each line contains a username, followed by the password and the roles of that user. For now, we will only use ‘admin’ and ‘user1’. Change the passwords of admin and user1 and disable the other users by commenting them out. You can also change the usernames and roles of the users. To learn more about Apache Shiro users and roles, read the Shiro authorization guide.
Once you have changed the passwords, the code block should look like this.
[users]
admin = StrongPassword, admin
user1 = UserPassword, role1, role2
# user2 = password3, role3
# user3 = password4, role2
Now restart Zeppelin to apply the changes.
sudo systemctl restart zeppelin
You will see that authentication has been enabled, and you will be able to log in using the username and password set in the Shiro configuration file.
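Since the [users] section shown above stores passwords in plain text, it is reasonable to keep the file readable only by the account that runs Zeppelin. The sketch below only checks the permission bits; the function name `is_owner_only` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper for illustration only.
# Return 0 if FILE grants no permissions to group or others.
is_owner_only() {
    # Characters 5-10 of the "ls -l" mode string are the group and other bits.
    perms=$(ls -ld "$1" | cut -c5-10)
    [ "$perms" = "------" ]
}

# Example usage (commented out so that sourcing this file has no side effects):
#   is_owner_only /opt/zeppelin/conf/shiro.ini || echo "shiro.ini is readable by other users"
```

If the check fails, one option is to run `sudo chown zeppelin:zeppelin /opt/zeppelin/conf/shiro.ini` followed by `sudo chmod 600 /opt/zeppelin/conf/shiro.ini`. Change the owner first: the file was copied with sudo, so it belongs to root, and restricting it before the ownership change would prevent the zeppelin user from reading it.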